TITLE OF THE INVENTION

IMAGE ENCODING AND DECODING METHOD AND DEVICE



BACKGROUND OF THE INVENTION 
Field of the Invention: 

This invention relates to an image encoding and 
decoding method, image encoding and decoding device, 
and more specifically, to a method of synthesizing 
interframe predicted images by calculating motion 
vectors of pixels in an image, by performing 
interpolation/extrapolation of motion vectors of 
representative points. 

Background of the Invention: 

In high efficiency encoding of a moving image, 
it is known that interframe prediction (motion 
compensation) which uses similarities between frames 
produced at different times has a major effect on data 
compression. The motion compensating system that has 
become the mainstream of current image encoding 
technique is the block matching scheme adopted in H.261, 
MPEG 1 and MPEG 2 which are the international standards 
for moving image coding. In this system, the image to 
be encoded is divided into a large number of blocks, 
and motion vectors are calculated for each block. 

Block matching is currently the most widely
used motion compensation technique, but when the whole
image is enlarged, reduced or rotated, motion vectors
have to be transmitted for all blocks so the coding
efficiency is poor. To deal with this problem, global
motion compensation has been proposed wherein the motion
vectors of the whole image are represented by a smaller
number of parameters (e.g. M. Hotter, "Differential
estimation of the global motion parameters zoom and
pan", Signal Processing, vol. 16, no. 3, pp. 249-265,
Mar. 1989). In this system, the motion vector (ug(x,y),
vg(x,y)) of a pixel (x, y) is expressed in the form:

$$
\begin{aligned}
u_g(x,y) &= a_0 x + a_1 y + a_2 \\
v_g(x,y) &= a_3 x + a_4 y + a_5
\end{aligned}
\tag{1}
$$

or

$$
\begin{aligned}
u_g(x,y) &= b_0 xy + b_1 x + b_2 y + b_3 \\
v_g(x,y) &= b_4 xy + b_5 x + b_6 y + b_7
\end{aligned}
\tag{2}
$$

and motion compensation is performed using this motion
vector. Herein, a0-a5, b0-b7 are the motion parameters.
When motion compensation is performed, the predicted
image on the transmitting side and receiving side must
be the same. For this purpose, the transmitting side
can transmit the values of a0-a5 or b0-b7 directly to
the receiving side; however, the motion vectors of
plural representative points may be transmitted instead.
Assume that the coordinates of pixels at the upper left,
upper right, lower left and lower right corners of an
image are respectively (0,0), (r,0), (0,s), (r,s) where
r and s are positive integers. If the horizontal and
vertical components of the motion vectors of the
representative points (0,0), (r,0), (0,s) are
respectively (ua,va), (ub,vb), (uc,vc), equation (1)
may be rewritten as:



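$$
\begin{aligned}
u_g(x,y) &= \frac{u_b - u_a}{r}\,x + \frac{u_c - u_a}{s}\,y + u_a \\
v_g(x,y) &= \frac{v_b - v_a}{r}\,x + \frac{v_c - v_a}{s}\,y + v_a
\end{aligned}
\tag{3}
$$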

This means that the same functions can be 
achieved by transmitting ua, va, ub, vb, uc, vc, 
instead of a0-a5. In the same way, using the 
horizontal and vertical components (ua,va), (ub,vb), 
(uc,vc), (ud,vd) of the motion vectors of the four 
representative points (0,0), (r,0), (0,s), (r,s), 
equation (2) may be rewritten as: 
$$
\begin{aligned}
u_g(x,y) &= \frac{s-y}{s}\left(\frac{r-x}{r}\,u_a + \frac{x}{r}\,u_b\right)
          + \frac{y}{s}\left(\frac{r-x}{r}\,u_c + \frac{x}{r}\,u_d\right) \\
         &= \frac{u_a - u_b - u_c + u_d}{rs}\,xy + \frac{-u_a + u_b}{r}\,x
          + \frac{-u_a + u_c}{s}\,y + u_a \\
v_g(x,y) &= \frac{v_a - v_b - v_c + v_d}{rs}\,xy + \frac{-v_a + v_b}{r}\,x
          + \frac{-v_a + v_c}{s}\,y + v_a
\end{aligned}
\tag{4}
$$

Therefore, the same functions can be achieved
by transmitting ua, va, ub, vb, uc, vc, ud, vd instead
of b0-b7. This situation is shown in Fig. 1. If
global motion compensation is performed between an
original image 102 and a reference image 101 in the
current frame, motion vectors 107, 108, 109, 110 of
representative points 103, 104, 105, 106 (wherein
motion vectors are defined as starting at points in the
original image and terminating at corresponding points
in the reference image of the current frame) may be
transmitted instead of the motion parameters. In this
specification, the system using equation (1) is
referred to as global motion compensation based on
linear interpolation/extrapolation, and the system
using equation (2) is referred to as global motion
compensation based on bilinear
interpolation/extrapolation.
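The relationship between equations (3) and (4) and the transmitted representative-point vectors can be illustrated with a minimal Python sketch. The function names `linear_mv` and `bilinear_mv` are hypothetical and do not appear in the specification; exact rational arithmetic (`fractions.Fraction`) is assumed here only to make the divisions by r and s explicit.

```python
from fractions import Fraction

def linear_mv(x, y, r, s, a, b, c):
    """Equation (3): a, b, c are the (u, v) vectors at (0,0), (r,0), (0,s)."""
    fx, fy = Fraction(x, r), Fraction(y, s)   # the divisions by r and s
    return tuple(a[n] + (b[n] - a[n]) * fx + (c[n] - a[n]) * fy
                 for n in range(2))

def bilinear_mv(x, y, r, s, a, b, c, d):
    """Equation (4): a, b, c, d are the vectors at (0,0), (r,0), (0,s), (r,s)."""
    fx, fy = Fraction(x, r), Fraction(y, s)
    return tuple(
        (1 - fy) * ((1 - fx) * a[n] + fx * b[n])
        + fy * ((1 - fx) * c[n] + fx * d[n])
        for n in range(2)
    )
```

At each representative point the interpolated vector coincides with the transmitted one, which is why the corner vectors can stand in for a0-a5 or b0-b7.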

Warping prediction is the application of this
global motion compensation processing to a smaller
area of the image. An example of warping prediction
using bilinear interpolation/extrapolation is shown in
Fig. 2. Fig. 2 shows processing to synthesize a
predicted image of an original image 202 of a current
frame using a reference image 201. First, the original
frame image 202 is divided into plural polygon-shaped
patches to give a patched image 209. The apex of a
patch is referred to as a lattice point, and each
lattice point is common to plural patches. For example,
a patch 210 comprises lattice points 211, 212, 213, 214,
and these lattice points are also the apices of other
patches. After dividing the image into plural patches
in this way, motion estimation is performed. In the
example shown here, motion estimation is performed on
lattice points using the reference image 201. As a
result, each patch in a reference image 203 obtained
after motion estimation is transformed. For example,
the patch 210 corresponds to a transformed patch 204.
This is because, in motion estimation, it was estimated
that the lattice points 205, 206, 207, 208 corresponded
to 211, 212, 213, 214.

The motion vectors of the lattice points are
thereby found, and an interframe predicted image is
synthesized by calculating the motion vector for each
pixel in the patch by bilinear interpolation. The
processing of this warping prediction is basically the
same as the global motion compensation shown in Fig. 1,
with the "motion vectors at the corners of the image"
replaced by the "motion vectors of the lattice
points". If a triangular patch is used instead of a
rectangular patch, warping prediction may be realized
by linear interpolation/extrapolation.

Examples of encoding and decoding techniques to
simplify global motion compensation for representing
motion vectors of the whole image using a small number
of parameters may be found in the applicant's
inventions "Image Encoding and Decoding Methods"
(Japanese Unexamined Patent Publication: Application No.
Hei 8-60572) and "Methods of Synthesizing Interframe
Predicted Images" (Japanese Unexamined Patent
Publication: Application No. Hei 8-249601).

By introducing global motion compensation or
warping prediction described above, the motion of the
image can be expressed with fewer parameters, and a
high data compression rate can be achieved. However,
the processing amount in encoding and decoding is
greater than in the conventional method. In particular,
the divisions of equations (3) and (4) are major
factors in making the processing complex. In other
words, in global motion compensation or warping
prediction, a problem arises in that the processing
amount required to synthesize predicted images is large.

SUMMARY OF THE INVENTION 

It is therefore an object of this invention to
reduce the computing amount by replacing the divisions
involved in motion compensation encoding and decoding
by a binary shift computation using registers with a
small number of bits.

In order to achieve the aforesaid objective,
this invention provides an image encoding and decoding
method for synthesizing an interframe predicted image
by global motion compensation or warping prediction
wherein global motion vectors are found by applying a
two-stage interpolation/extrapolation to motion vectors
of plural representative points having a spatial
interval with a characteristic feature. More
specifically, this invention provides a method of
synthesizing an interframe predicted image wherein,
when the motion vector of a pixel is calculated by
performing bilinear interpolation/extrapolation on
motion vectors of four representative points of an
image where the pixel sampling interval in both the
horizontal and vertical directions is 1 and the
horizontal and vertical coordinates of the sampling
points are obtained by adding w to integers (where
w=wn/wd, wn is a non-negative integer, wd is the hw
power of 2, hw is a non-negative integer and wn<wd),
there are representative points at coordinates (i, j),
(i+p, j), (i, j+q), (i+p, j+q) (where i, j, p, q are
integers), the horizontal and vertical components of
the motion vectors of representative points take the
values of integral multiples of 1/k (where k is the hk
power of 2, and hk is a non-negative integer), and when
the motion vector of a pixel at the coordinates
(x+w, y+w) is found, the horizontal and vertical
components of the motion vector at the coordinates
(x+w, j) are found by linear interpolation/extrapolation
of motion vectors of representative points at
coordinates (i, j), (i+p, j), as values which are
respectively integral multiples of 1/z (where z is the
hz power of 2, and hz is a non-negative integer), and
after finding the horizontal and vertical components of
the motion vector at the coordinates (x+w, j+q) by
linear interpolation/extrapolation of motion vectors of
representative points at coordinates (i, j+q),
(i+p, j+q), likewise as values which are respectively
integral multiples of 1/z, the horizontal and vertical
components of the motion vector of the pixel at the
coordinates (x+w, y+w) are found by linear
interpolation/extrapolation of the aforesaid two motion
vectors at the coordinates (x+w, j), (x+w, j+q) as
values which are respectively integral multiples of 1/m
(where m is the hm power of 2, and hm is a non-negative
integer).

This invention makes it possible to perform
divisions by means of shift computations by
appropriately selecting representative point
coordinates, and to implement the aforesaid motion
compensation scheme using registers having a small
number of bits by reducing the number of shift bits in
the shift computations.



BRIEF DESCRIPTION OF THE DRAWINGS: 

Fig. 1 is a diagram showing an example of
global motion compensation for transmitting motion
vectors of representative points.

Fig. 2 is a diagram showing an example of
warping prediction.

Fig. 3 is a diagram showing an example of the
position of representative points for performing high
speed processing.

Fig. 4 is a diagram showing a typical
construction of a software image encoding device.

Fig. 5 is a diagram showing a typical
construction of the software image decoding device.

Fig. 6 is a diagram showing a typical
construction of an image encoding device according to
this invention.

Fig. 7 is a diagram showing a typical
construction of the image encoding device according to
this invention.

Fig. 8 is a diagram showing a typical
construction of a motion compensation processor 616 of
Fig. 6.

Fig. 9 is a diagram showing another typical
construction of the motion compensation processor 616
of Fig. 6.

Fig. 10 is a diagram showing a typical
construction of a predicted image synthesizer 711 of
Fig. 7.

Fig. 11 is a diagram showing a typical
construction of a predicted image synthesizer 1103 of
Fig. 9.

Fig. 12 is a diagram showing a typical
construction of a global motion compensation predicted
image synthesizer.

Fig. 13 is a diagram showing an example of a
processing flowchart in the software image encoding
device.

Fig. 14 is a diagram showing an example of a
motion compensation processing flowchart in the
software image encoding device.

Fig. 15 is a diagram showing an example of a
processing flowchart in the software image decoding
device.

Fig. 16 is a diagram showing an example of a
predicted image synthesis flowchart in the software
image decoding device.

Fig. 17 is a diagram showing a specific example
of a device using image encoding/decoding which
synthesizes a global motion compensation predicted
image by two-stage processing.

Preferred Embodiments of the Invention: 
This invention is an application of an
invention relating to a method of accelerating the
computation involved in global motion compensation and
warping prediction already proposed by the applicant
(Application No. Hei 08-060572 and Application No. Hei
08-249601). This invention will be described in the
context of its application to global motion
compensation, but it may also be applied to warping
prediction wherein identical processing to that of
global motion compensation is performed.

In the following description, it will be
assumed that the pixel sampling interval is 1 in both
the horizontal and vertical directions and a pixel
exists at each point whose horizontal and vertical
coordinates are obtained by adding w (where w=wn/wd, wn
is a non-negative integer, wd is a positive integer and
wn<wd) to integers. w represents the phase shift
between the coordinates of a representative point in
global motion compensation and the coordinates of a
pixel. Typically it has the values 0, 1/2, 1/4.
Further, it will be assumed that the numbers of pixels
of the image in the horizontal and vertical directions
are respectively r and s (where r and s are positive
integers), and image pixels lie in a range such that
the horizontal coordinate is from 0 to less than r, and
the vertical coordinate is from 0 to less than s.

When motion compensation is performed using
linear interpolation/extrapolation (affine
transformation) or bilinear interpolation/extrapolation
(co-1st order transformation), and quantization is
performed on the motion vector of each pixel,
mismatches are prevented and computations are
simplified (Japanese Unexamined Patent Publication:
Application No. Hei 06-193970). Hereafter, it will be
assumed that the horizontal component and vertical
component of the motion vector of a pixel are integral
multiples of 1/m (where m is a positive integer). It
will moreover be assumed that the global motion
compensation using motion vectors of representative
points described in "Background of the Invention" will
be performed, and that the motion vectors of
representative points are integral multiples of 1/k
(where k is a positive integer). In this specification,
the term "motion vector of a pixel" implies a motion
vector used in order to actually synthesize a predicted
image when performing global motion compensation.

On the other hand, the term "motion vector of a
representative point" means a parameter used to
calculate a motion vector of a pixel. Therefore, it may
occur that the motion vector of a pixel and the motion
vector of a representative point do not coincide even
if they are located at the same coordinates due to
differences of quantization step size, etc.

First, global motion compensation using linear
interpolation/extrapolation will be described referring
to Fig. 3. In this example, instead of taking a
representative point situated at the corner 301 of the
image, representative points 302, 303, 304, 305 are
generalized at (i, j), (i+p, j), (i, j+q), (i+p, j+q)
(where i, j, p, q are integers). The points 302, 303,
304, 305 may be situated inside or outside the image.
If the horizontal and vertical components of the motion
vectors of representative points multiplied by k are
respectively (u0,v0), (u1,v1), (u2,v2), (u3,v3) (where
u0, v0, u1, v1, u2, v2, u3, v3 are integers), the
values obtained by multiplying the horizontal and
vertical components of a motion vector of a pixel
situated at (x+w, y+w) by m, i.e., (u(x+w, y+w),
v(x+w, y+w)), may be expressed by the following
equation when w=0:

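$$
\begin{aligned}
u(x+w, y+w) &= u(x,y) \\
&= \bigl(\bigl((j+q-y)((i+p-x)u_0 + (x-i)u_1) \\
&\qquad + (y-j)((i+p-x)u_2 + (x-i)u_3)\bigr)m\bigr) \mathbin{//} (pqk) \\
v(x+w, y+w) &= v(x,y) \\
&= \bigl(\bigl((j+q-y)((i+p-x)v_0 + (x-i)v_1) \\
&\qquad + (y-j)((i+p-x)v_2 + (x-i)v_3)\bigr)m\bigr) \mathbin{//} (pqk)
\end{aligned}
\tag{5}
$$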

where x, y, u(x,y), v(x,y) are integers,
[//] is a division which rounds the result of an
ordinary division to the nearest integer when the
result is not an integer, and the order of priority as
an operator is equivalent to multiplication and
division. It is desirable to round off non-integral
values to the nearest integer so as to minimize
computing errors. In this regard, there are four
methods of rounding a value which is the sum of 1/2 and
an integer, i.e.:

(1) round the value toward 0,

(2) round the value away from 0,

(3) round the value toward 0 when the dividend is
negative, and away from 0 when the dividend is positive
(assuming the divisor is always positive), i.e. always
round to the next highest integer,

(4) round the value away from 0 when the dividend is
negative, and toward 0 when the dividend is positive
(assuming the divisor is always positive), i.e. always
round to the next lowest integer.

In (3) and (4), the direction of rounding does
not change depending on whether the dividend is
positive or negative, and these methods therefore offer
an advantage from the viewpoint of processing amount to
the extent that it is unnecessary to determine positive
or negative. High speed processing using (3) can be
performed using for example the following equation (6).

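$$
\begin{aligned}
u(x+w, y+w) &= u(x,y) \\
&= \bigl(Lpqk + \bigl((j+q-y)((i+p-x)u_0 + (x-i)u_1) \\
&\qquad + (y-j)((i+p-x)u_2 + (x-i)u_3)\bigr)m + ((pqk) \mathbin{\#} 2)\bigr) \mathbin{\#} (pqk) - L \\
v(x+w, y+w) &= v(x,y) \\
&= \bigl(Mpqk + \bigl((j+q-y)((i+p-x)v_0 + (x-i)v_1) \\
&\qquad + (y-j)((i+p-x)v_2 + (x-i)v_3)\bigr)m + ((pqk) \mathbin{\#} 2)\bigr) \mathbin{\#} (pqk) - M
\end{aligned}
\tag{6}
$$
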
where "#" is an integer division wherein digits after
the decimal point are discarded, and the order of
priority of computation is assumed to be the same as
that of multiplication and division. Generally, this is
the type of division that can be realized most easily
by a computer. L and M are numbers for ensuring that
the dividend is always positive, and are positive
integers which are sufficiently large. The term
(pqk#2) is used to round off the division result to the
nearest integer.
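The rounding trick of equation (6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `div_round_half_up` and the parameter name `offset` (playing the role of L or M) are hypothetical, and Python's `//` is used in place of "#" because the two agree whenever the dividend is non-negative, which is exactly what the offset guarantees.

```python
import math
from fractions import Fraction

def div_round_half_up(dividend, divisor, offset):
    """Rounded division in the style of equation (6).

    offset * divisor is added so that the truncating division only ever sees
    a non-negative dividend; divisor // 2 turns truncation into rounding to
    the nearest integer, and the offset is subtracted again afterwards.
    """
    shifted = offset * divisor + dividend + divisor // 2
    assert shifted >= 0, "offset is too small for this dividend"
    return shifted // divisor - offset

# Reference value computed with exact rationals: floor(a/d + 1/2).
def reference(dividend, divisor):
    return math.floor(Fraction(dividend, divisor) + Fraction(1, 2))
```

Because the same code path is taken for positive and negative dividends, no sign test is needed per pixel.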

Processing in terms of integers in itself
contributes to reducing the amount of processing.
However, if p, q, k are respectively set equal to the
α, β, hk power of 2 (where α, β, hk are non-negative
integers), the calculation of equation (5) can be
performed by a shift computation of α+β+hk bits, and
the amount of processing performed by the computer and
special hardware can be largely reduced. If m is set
equal to the hm power of 2 (where hm is a non-negative
integer, and hm<α+β+hk), equation (6) may be written
as:

$$
\begin{aligned}
u(x+w, y+w) &= u(x,y) \\
&= \bigl((2L+1) \ll (\alpha+\beta+h_k-h_m-1) \\
&\qquad + (j+q-y)((i+p-x)u_0 + (x-i)u_1) \\
&\qquad + (y-j)((i+p-x)u_2 + (x-i)u_3)\bigr) \gg (\alpha+\beta+h_k-h_m) - L \\
v(x+w, y+w) &= v(x,y) \\
&= \bigl((2M+1) \ll (\alpha+\beta+h_k-h_m-1) \\
&\qquad + (j+q-y)((i+p-x)v_0 + (x-i)v_1) \\
&\qquad + (y-j)((i+p-x)v_2 + (x-i)v_3)\bigr) \gg (\alpha+\beta+h_k-h_m) - M
\end{aligned}
\tag{7}
$$

Here, "x<<α" means that x is shifted by α bits to the
left, and 0 is set in the lower α bits. "x>>α" means
that x is shifted by α bits to the right, and 0 is set
in the upper α bits. The order of priority of these
operators is intermediate between addition/subtraction
and multiplication/division. Therefore the number of
shift bits may be written as α+β+hk-hm.
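As a rough check of the step from equation (6) to equation (7), the following Python sketch evaluates one motion vector component both ways for w=0. The function names are hypothetical, and `u` is assumed to hold (u0, u1, u2, u3), i.e. k times the component at the four representative points. Dividing the numerator and denominator of equation (6) by m is what reduces the shift from α+β+hk bits to α+β+hk-hm bits.

```python
def weighted_sum(x, y, i, j, p, q, u):
    """Common numerator term of equations (5) to (7), before scaling by m."""
    u0, u1, u2, u3 = u
    return ((j + q - y) * ((i + p - x) * u0 + (x - i) * u1)
            + (y - j) * ((i + p - x) * u2 + (x - i) * u3))

def component_by_division(x, y, i, j, p, q, k, m, u, L):
    """Equation (6): rounded division by p*q*k with offset L."""
    d = p * q * k
    return (L * d + weighted_sum(x, y, i, j, p, q, u) * m + d // 2) // d - L

def component_by_shift(x, y, i, j, alpha, beta, hk, hm, u, L):
    """Equation (7): the same value using shifts only.

    Assumes p = 2**alpha, q = 2**beta, k = 2**hk, m = 2**hm and
    hm < alpha + beta + hk.
    """
    p, q = 1 << alpha, 1 << beta
    n = alpha + beta + hk - hm            # number of shift bits
    acc = ((2 * L + 1) << (n - 1)) + weighted_sum(x, y, i, j, p, q, u)
    return (acc >> n) - L
```

With p=q=512, k=32 and m=16 the shift count is 9+9+5-4=19 bits for w=0.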

When w is not 0, according to the definition 
w=wn/wd, equation (5) may be rewritten by the following 
equation (8) : 



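$$
\begin{aligned}
u(x+w, y+w) &= u\!\left(x + \frac{w_n}{w_d},\; y + \frac{w_n}{w_d}\right) \\
&= \bigl(\bigl((w_d j + w_d q - w_d y - w_n)((w_d i + w_d p - w_d x - w_n)u_0 + (w_d x + w_n - w_d i)u_1) \\
&\qquad + (w_d y + w_n - w_d j)((w_d i + w_d p - w_d x - w_n)u_2 + (w_d x + w_n - w_d i)u_3)\bigr)m\bigr) \\
&\qquad \mathbin{//} (w_d^2\, pqk) \\
v(x+w, y+w) &= v\!\left(x + \frac{w_n}{w_d},\; y + \frac{w_n}{w_d}\right) \\
&= \bigl(\bigl((w_d j + w_d q - w_d y - w_n)((w_d i + w_d p - w_d x - w_n)v_0 + (w_d x + w_n - w_d i)v_1) \\
&\qquad + (w_d y + w_n - w_d j)((w_d i + w_d p - w_d x - w_n)v_2 + (w_d x + w_n - w_d i)v_3)\bigr)m\bigr) \\
&\qquad \mathbin{//} (w_d^2\, pqk)
\end{aligned}
\tag{8}
$$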

If wd is the hw power of 2 and hw is a
non-negative integer, the division by (p·q·k·wd·wd)
becomes a shift computation of α+β+hk+2hw bits, and the
division may therefore be replaced by a shift
computation as in the case of w=0. Also as in the case
of equation (7), if hm<α+β+hk+2hw, the number of
shift bits can be reduced to α+β+hk+2hw-hm bits by
dividing both the numerator and denominator by m.
Therefore, provided that wd is the hw power of 2, the
processing when w=0 and when w≠0 is basically the same.
Hereafter, although the equations are somewhat complex,
the case w≠0 will be described. To find the
calculation results for w=0, the substitutions wn=0,
wd=1, hw=0 may be made.

To obtain the same global motion compensation
predicted image on the transmitting and receiving sides,
information about motion vectors of representative
points must be transmitted to the receiving side in
some form or other. In one method, the motion vectors
of representative points are transmitted without
modification, and in another method, the motion vectors
of the corners of the image are transmitted and the
motion vectors of representative points are calculated
from these values. Hereafter, the latter method is
described.

Assume that the motion vectors of the four
corners (-c, -c), (r-c, -c), (-c, s-c), (r-c, s-c) of
the image can be only integral multiples of 1/n (where
n is a positive integer, c=cn/cd, cn is a non-negative
integer, cd is a positive integer and cn<cd), and that
(u00, v00), (u01, v01), (u02, v02), (u03, v03), which
are the horizontal and vertical components of these
vectors multiplied by n, are transmitted as global
motion parameters. c represents a phase shift between
the corners and representative points, and it typically
has a value of 0, 1/2 or 1/4. (u0, v0), (u1, v1),
(u2, v2), and (u3, v3), which are the horizontal and
vertical components of the motion vectors respectively
at the points (i, j), (i+p, j), (i, j+q), and
(i+p, j+q) multiplied by k, may be defined as:



$$
\begin{aligned}
u_0 &= u'(i, j) \\
v_0 &= v'(i, j) \\
u_1 &= u'(i+p, j) \\
v_1 &= v'(i+p, j) \\
u_2 &= u'(i, j+q) \\
v_2 &= v'(i, j+q) \\
u_3 &= u'(i+p, j+q) \\
v_3 &= v'(i+p, j+q)
\end{aligned}
\tag{9}
$$

where u'(x, y), v'(x, y) are defined by transforming
equation (5):

$$
\begin{aligned}
u'(x,y) &= \bigl(\bigl((c_d s - c_n - c_d y)((c_d r - c_n - c_d x)u_{00} + (c_d x + c_n)u_{01}) \\
&\qquad + (c_d y + c_n)((c_d r - c_n - c_d x)u_{02} + (c_d x + c_n)u_{03})\bigr)k\bigr) \mathbin{///} (c_d^2\, rsn) \\
v'(x,y) &= \bigl(\bigl((c_d s - c_n - c_d y)((c_d r - c_n - c_d x)v_{00} + (c_d x + c_n)v_{01}) \\
&\qquad + (c_d y + c_n)((c_d r - c_n - c_d x)v_{02} + (c_d x + c_n)v_{03})\bigr)k\bigr) \mathbin{///} (c_d^2\, rsn)
\end{aligned}
\tag{10}
$$



[///] is a division wherein the computation
result is rounded to the nearest integer when the
result of an ordinary division is not an integer, and
the order of priority is equivalent to multiplication
and division. In this way, if (u0, v0), (u1, v1),
(u2, v2), and (u3, v3) are calculated and global motion
compensation is performed at the representative points
(i, j), (i+p, j), (i, j+q), and (i+p, j+q), global
motion compensation at the representative points
(-c, -c), (r-c, -c), (-c, s-c), and (r-c, s-c) can be
approximated. As described hereabove, if p and q are
non-negative integral powers of 2, the processing can
be simplified. In general, it is preferable not to
perform extrapolation when calculating motion vectors
of pixels in the image by equation (5). This is to
avoid increase of quantization error in motion vectors
of representative points due to extrapolation. For
this reason, it is desirable that all the
representative points are in such positions that they
surround the pixels in the image. Hence when i=j=c=0,
it is appropriate that p and q are effectively the same
as, or have slightly larger values than, r and s. Care
must however be exercised because, if the values of p
and q are too large, the number of bits required for the
calculation increases.

To reduce computational error in the processing
of equations (9) and (10), it is preferable that [///]
rounds off non-integral values to the nearest integer.
In this regard, the sum of 1/2 and an integer may be
rounded off to the nearest integer by any of the
aforesaid methods (1)-(4). However, compared to the
case of equation (5) (calculation performed for each
pixel), equation (10) requires fewer computations (only
four calculations for one image), so even when
method (1) or (2) is chosen, there is
not much effect on the total computational amount.
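The derivation of representative-point vectors from the transmitted corner vectors (equations (9) and (10)) can be sketched as follows. This is a minimal Python illustration with hypothetical names (`div_round`, `rep_point_component`); the rounding helper rounds halves toward the next highest integer, which is one of the admissible choices rather than the only one.

```python
def div_round(a, d):
    """Round a/d to the nearest integer (halves go up); d must be positive."""
    return (2 * a + d) // (2 * d)

def rep_point_component(x, y, r, s, cn, cd, n, k, corners):
    """Equation (10): k times one motion vector component at (x, y).

    corners = (u00, u01, u02, u03) holds n times that component at the image
    corners (-c, -c), (r-c, -c), (-c, s-c), (r-c, s-c), where c = cn/cd.
    """
    u00, u01, u02, u03 = corners
    left, right = cd * r - cn - cd * x, cd * x + cn    # horizontal weights
    top, bottom = cd * s - cn - cd * y, cd * y + cn    # vertical weights
    num = (top * (left * u00 + right * u01)
           + bottom * (left * u02 + right * u03)) * k
    return div_round(num, cd * cd * r * s * n)

def rep_points(i, j, p, q, r, s, cn, cd, n, k, corners):
    """Equation (9): evaluate equation (10) at the four representative points."""
    pts = ((i, j), (i + p, j), (i, j + q), (i + p, j + q))
    return [rep_point_component(x, y, r, s, cn, cd, n, k, corners)
            for x, y in pts]
```

Since this runs only four times per image and component, the cost of a general division here is negligible next to the per-pixel work.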



If the values of p and q are set to non-negative
integral powers of 2 as described in the above
examples, the synthesis of interframe predicted images
in global motion compensation is greatly simplified.
However, there is still one other problem. Considering
for example the case p=512, q=512, k=32, m=16, wd=2,
wn=1 (w=0.5), which are typical parameters in image
coding, we have α+β+hk+2hw-hm=21. This means that
when u(x+w, y+w) is a value requiring 12 or more bits
in binary form, a register of at least 33 bits is
required to perform the high speed computation of
equation (8). When for example m=16, the value of
u(x+w, y+w) is obtained by multiplying the horizontal
component of the real motion vector by 16, so this
could well be a value requiring 12 or more bits in
binary form. At the present time, few processors have
registers capable of storing integers of 33 or more
bits, and they are expected to remain costly in future.
Moreover in general, if the processor circuit is large,
power consumption is correspondingly greater, so an
algorithm requiring a large register is also
disadvantageous from the viewpoint of power consumption.
Therefore it is desirable that even when the division
can be replaced by a shift computation, the number of
shift bits is as small as possible.
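The register-width arithmetic above can be reproduced with a short Python sketch. The helper names are hypothetical; the only assumption is that every parameter is an exact power of 2, as the text requires.

```python
def log2_exact(value):
    """Return h such that value == 2**h; value must be a power of 2."""
    assert value > 0 and value & (value - 1) == 0, "not a power of 2"
    return value.bit_length() - 1

def single_stage_shift_bits(p, q, k, m, wd):
    """Shift count alpha + beta + hk + 2*hw - hm for equation (8)."""
    return (log2_exact(p) + log2_exact(q) + log2_exact(k)
            + 2 * log2_exact(wd) - log2_exact(m))

def register_bits(shift_bits, result_bits):
    """Bits needed before the shift when the result needs result_bits bits."""
    return shift_bits + result_bits
```

For p=q=512, k=32, m=16, wd=2 this gives 21 shift bits, and a 12-bit result then needs a 33-bit register.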

To resolve this problem, the two-step algorithm
according to this invention which is described below
may be used. Prior to calculating the motion vector of
the pixel at the point (x+w, y+w) using the motion
vectors of the representative points (i, j), (i+p, j),
(i, j+q), (i+p, j+q), motion vectors at provisional
representative points (i, y+w) and (i+p, y+w) are
calculated so that the horizontal and vertical
components are integral multiples of 1/z (where z is a
positive integer). As in the aforesaid example, the
horizontal and vertical components of the motion
vectors of the representative points (i, j), (i+p, j),
(i, j+q), (i+p, j+q) multiplied by k are taken to be
respectively (u0, v0), (u1, v1), (u2, v2), (u3, v3)
(where u0, v0, u1, v1, u2, v2, u3, v3 are integers).
If the provisional representative points are situated
at (i, y+w) and (i+p, y+w), (uL(y+w), vL(y+w)) and
(uR(y+w), vR(y+w)), which are the horizontal and
vertical components of the motion vectors of these
provisional representative points multiplied by z, are
defined as follows:











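$$
\begin{aligned}
u_L(y+w) &= \bigl(\bigl((w_d j + w_d q - w_d y - w_n)u_0 + (w_d y + w_n - w_d j)u_2\bigr)z\bigr) \mathbin{////} (w_d\, qk) \\
v_L(y+w) &= \bigl(\bigl((w_d j + w_d q - w_d y - w_n)v_0 + (w_d y + w_n - w_d j)v_2\bigr)z\bigr) \mathbin{////} (w_d\, qk) \\
u_R(y+w) &= \bigl(\bigl((w_d j + w_d q - w_d y - w_n)u_1 + (w_d y + w_n - w_d j)u_3\bigr)z\bigr) \mathbin{////} (w_d\, qk) \\
v_R(y+w) &= \bigl(\bigl((w_d j + w_d q - w_d y - w_n)v_1 + (w_d y + w_n - w_d j)v_3\bigr)z\bigr) \mathbin{////} (w_d\, qk)
\end{aligned}
\tag{11}
$$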









[////] is a division which rounds the result
of an ordinary division to the nearest integer when
the result is not an integer, and the order of priority
is equivalent to multiplication and division. (The
required function for [////] is the same as the [///]
described above.) As (i, y+w) lies on a line joining
(i, j) and (i, j+q), (uL(y+w), vL(y+w)) can easily be
found by a first order linear
interpolation/extrapolation using (u0, v0) and (u2, v2).
Likewise, as (i+p, y+w) lies on a line joining (i+p, j)
and (i+p, j+q), (uR(y+w), vR(y+w)) may also be found by
a first order linear interpolation/extrapolation using
(u1, v1) and (u3, v3).

By performing another first order linear
interpolation/extrapolation on the motion vectors
(uL(y+w), vL(y+w)) and (uR(y+w), vR(y+w)) of the
provisional representative points found as described
above, (u(x+w, y+w), v(x+w, y+w)), which are the
horizontal and vertical components of the motion vector
of the pixel at (x+w, y+w) multiplied by m, are found.
This processing is performed by the following equation:



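$$
\begin{aligned}
u(x+w, y+w) &= \bigl(\bigl((w_d i + w_d p - w_d x - w_n)\,u_L(y+w) \\
&\qquad + (w_d x + w_n - w_d i)\,u_R(y+w)\bigr)m\bigr) \mathbin{//} (w_d\, pz) \\
v(x+w, y+w) &= \bigl(\bigl((w_d i + w_d p - w_d x - w_n)\,v_L(y+w) \\
&\qquad + (w_d x + w_n - w_d i)\,v_R(y+w)\bigr)m\bigr) \mathbin{//} (w_d\, pz)
\end{aligned}
\tag{12}
$$
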
In the same way as described above, if p is the
α power of 2, m is the hm power of 2, z is the hz
power of 2, wd is the hw power of 2 (where α, hm, hz,
hw are non-negative integers), the division by p·z·wd
in equation (12) may be replaced by an α+hz+hw-hm bit
right shift (where hm<α+hz+hw). Also, if z=16 (hz=4),
and the typical parameters p=512, q=512, k=32, m=16,
wd=2, wn=1 (w=0.5) are used, the number of shift bits
is 10, so the number of bits required for the register
used in the computation can be largely reduced. It may
be noted that in the above example, the motion vector
of a provisional representative point is found by
performing a first order linear
interpolation/extrapolation in the vertical direction
on the motion vectors of representative points, and the
motion vector of a pixel is then found by performing a
first order linear interpolation/extrapolation in the
horizontal direction on the motion vectors of these
provisional representative points. Conversely, the same
result may be obtained by performing a first order
linear interpolation/extrapolation in the horizontal
direction when the motion vector of a provisional
representative point is found, and in the vertical
direction when the motion vector of a pixel is found.
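The two-stage computation of equations (11) and (12) can be sketched in Python as below. The function names are hypothetical, one component is handled at a time, and the rounded divisions are written with a general helper rather than shifts so that the structure stays visible; with power-of-2 parameters each division corresponds to a right shift as described above.

```python
from fractions import Fraction

def div_round(a, d):
    """Round a/d to the nearest integer (halves go up); d must be positive."""
    return (2 * a + d) // (2 * d)

def provisional(y, j, q, k, z, wn, wd, top, bottom):
    """Equation (11): z times a component at a provisional point on row y+w."""
    num = ((wd * j + wd * q - wd * y - wn) * top
           + (wd * y + wn - wd * j) * bottom) * z
    return div_round(num, wd * q * k)

def pixel(x, i, p, z, m, wn, wd, left, right):
    """Equation (12): m times the pixel's component at column x+w."""
    num = ((wd * i + wd * p - wd * x - wn) * left
           + (wd * x + wn - wd * i) * right) * m
    return div_round(num, wd * p * z)

def two_stage_row(y, width, i, j, p, q, k, z, m, wn, wd, u):
    """Compute one row; the two provisional values are shared by all pixels."""
    u0, u1, u2, u3 = u
    u_left = provisional(y, j, q, k, z, wn, wd, u0, u2)
    u_right = provisional(y, j, q, k, z, wn, wd, u1, u3)
    return [pixel(x, i, p, z, m, wn, wd, u_left, u_right) for x in range(width)]

def exact(x, y, i, j, p, q, k, m, wn, wd, u):
    """Unquantized bilinear value (times m), for comparison only."""
    u0, u1, u2, u3 = u
    fx = Fraction(wd * (x - i) + wn, wd * p)
    fy = Fraction(wd * (y - j) + wn, wd * q)
    return Fraction(m, k) * ((1 - fy) * ((1 - fx) * u0 + fx * u1)
                             + fy * ((1 - fx) * u2 + fx * u3))
```

When the pixels lie between the representative points and z=m, the intermediate quantization to 1/z adds at most half a step of 1/m, so the two-stage result stays within one step of the unquantized value.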

In this scheme, the two steps of equations (11)
and (12) are required to find a motion vector of a
pixel, and at first sight it might appear that this
would lead to an increase of computation amount.
However, once the motion vector of a provisional
representative point has been found, it may be used
for all r pixels on a line having the vertical
coordinate y+w, so the percentage of the total
processing amount due to equation (11) is very small.
Therefore, the advantage gained by the smaller number
of shift bits (i.e. registers with fewer bits)
outweighs the disadvantage of increased computation
amount due to having to perform the calculation of
equation (11).

After obtaining the values (u(x+w, y+w),
v(x+w, y+w)), they may be divided into
integral parts (uI(x+w, y+w), vI(x+w, y+w)) and
fractional parts (uF(x+w, y+w), vF(x+w, y+w)) by the
following processing.

$$
\begin{aligned}
u_I(x+w, y+w) &= ((Lm + u(x+w, y+w)) \gg h_m) - L \\
v_I(x+w, y+w) &= ((Mm + v(x+w, y+w)) \gg h_m) - M
\end{aligned}
\tag{13}
$$

$$
\begin{aligned}
u_F(x+w, y+w) &= u(x+w, y+w) - u_I(x+w, y+w)\,m \\
v_F(x+w, y+w) &= v(x+w, y+w) - v_I(x+w, y+w)\,m
\end{aligned}
\tag{14}
$$

20 where ul (x+w, y+w), vl (x+w, y+w) are integers 

expressing the integral parts of a motion vector of a 
pixel. uF(x+w, y+w), vF(x+w, y+w) are integers both 
having values from 0 to less than m expressing m times 
the fractional parts of a motion vector of a pixel. As 
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in the above example, m is the hm power of 2 (where hm 
is a non-negative integer) , and L and M are 
sufficiently large to make the shift values non- 
negative . 
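
Equations (13) and (14) can be sketched as follows. The function name and the single `offset` argument (which plays the role of L or M) are assumptions made for illustration.

```python
def split_motion_component(u: int, hm: int, offset: int) -> tuple[int, int]:
    """Split u (m times a motion vector component, with m = 2**hm) into
    its integral part and m times its fractional part."""
    m = 1 << hm
    biased = offset * m + u          # L*m + u in equation (13)
    if biased < 0:
        raise ValueError("offset is too small: value to be shifted is negative")
    u_int = (biased >> hm) - offset  # equation (13)
    u_frac = u - u_int * m           # equation (14), 0 <= u_frac < m
    return u_int, u_frac
```

For example, with m = 4 the value u = -5 (i.e. -1.25 pixels) is split into the integral part -2 and the fractional part 3 (i.e. 0.75 pixels).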

When bilinear interpolation is used as the
method of interpolating luminance values, the luminance
value of a pixel in the interframe predicted image can
also be found by the following processing. When x' =
x+w+uI(x+w, y+w), y' = y+w+vI(x+w, y+w), if the
luminance values of pixels at (x', y'), (x'+1, y'), (x',
y'+1), (x'+1, y'+1) in the reference image are Ya, Yb,
Yc, Yd, the luminance value Y(x+w, y+w) of a pixel at
the point (x+w, y+w) in the interframe predicted image
may be found by:

$$Y(x+w, y+w) = ((m - v_F)((m - u_F)Y_a + u_F Y_b) + v_F((m - u_F)Y_c + u_F Y_d) + (m^2 \gg 1)) \gg (2h_m)$$
(15)

where uF, vF are respectively abbreviations for uF(x+w,
y+w), vF(x+w, y+w).
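
Equation (15) can be sketched as follows; the function and argument names are assumptions made for illustration, and the four luminance values are taken as already fetched from the reference image.

```python
def predict_luminance(ya: int, yb: int, yc: int, yd: int,
                      u_frac: int, v_frac: int, hm: int) -> int:
    """Bilinear luminance interpolation of equation (15).

    ya, yb, yc, yd are the luminance values at (x', y'), (x'+1, y'),
    (x', y'+1), (x'+1, y'+1); u_frac and v_frac are m times the
    fractional parts of the motion vector, with m = 2**hm.
    """
    m = 1 << hm
    upper = (m - u_frac) * ya + u_frac * yb   # horizontal blend on line y'
    lower = (m - u_frac) * yc + u_frac * yd   # horizontal blend on line y'+1
    rounding = (m * m) >> 1                   # half of the divisor m**2
    return ((m - v_frac) * upper + v_frac * lower + rounding) >> (2 * hm)
```

The whole interpolation uses only integer multiplications, additions and one 2hm bit right shift.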

In equations (12) and (13), α+hz+hw-hm bit and
hm bit right shifts are respectively performed. This
means that if an (α+hz+hw-hm)+hm = α+hz+hw bit right
shift is performed in the calculation of equation (10),
uI(x+w, y+w) and vI(x+w, y+w) can be calculated in one
step. It is convenient if α+hz+hw is an integral
multiple of 8 because, in general, the size of a
processor register is a multiple of 8 bits.
Frequently, two 8 bit registers (an upper bit register
and a lower bit register) are linked to make a 16 bit
register, or four 8 bit registers or two 16 bit
registers are linked to make a 32 bit register. If the
values of uI(x+w, y+w) and vI(x+w, y+w) are calculated
by, for example, a 16 bit shift computation, there is
then no need to perform an explicit shift computation.
In other words, if the value prior to the shift is
stored in a 32 bit register, and the upper 16 bits are
used as a separate register, the value of
uI(x+w, y+w) or vI(x+w, y+w) is already held in this
16 bit register.

It will be appreciated that making the number
of shift bits an integral multiple of 8 facilitates not
only the processing of equation (10) but all aspects of
the shift computation. It is particularly important to
make processing easier when a large number of shift
computations has to be performed (e.g. a shift
computation for each pixel). Also, by first applying
the same number of bits of left shift to both the
numerator and the denominator, the number of right
shift bits due to division can be increased even when
the number of shift bits is not an integral multiple
of 8. For example, when a computation is performed by
a 6 bit right shift, the same computation can be
performed by an 8 bit right shift by first multiplying
the value to which the shift operation is applied by 4
(this is equivalent to performing a 2 bit left shift).
(Taking equation (5), which concerns u(x+w, y+w), as an
example, this processing can be implemented by first
multiplying u0, u1, u2, u3 by 4.) Care must however be
taken when performing this processing that overflow of
the value to be shifted does not occur.
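
The two register techniques described above can be sketched as follows, assuming a signed 32 bit register whose upper 16 bits can be read as a signed 16 bit register; the helper names are hypothetical.

```python
import struct


def upper_half_word(value: int) -> int:
    """Read the upper 16 bits of a signed 32 bit register as a signed
    16 bit register. The result equals value >> 16, but no shift
    computation is performed."""
    packed = struct.pack(">i", value)   # struct.error if value exceeds 32 bits
    (upper,) = struct.unpack(">h", packed[:2])
    return upper


def shift_via_wider_shift(value: int, needed_bits: int, used_bits: int) -> int:
    """Obtain the result of a needed_bits right shift from a used_bits
    right shift by first multiplying value by 2**(used_bits - needed_bits)."""
    pad = used_bits - needed_bits
    if pad < 0:
        raise ValueError("used_bits must not be smaller than needed_bits")
    widened = value << pad
    if not -(1 << 31) <= widened < (1 << 31):
        raise OverflowError("value to be shifted overflows 32 bits")
    return widened >> used_bits
```

The overflow check corresponds to the caution given in the text: the pre-multiplication consumes `used_bits - needed_bits` bits of headroom in the register.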

Many image encoders and decoders are able to
adapt to a plurality of image sizes. In this case,
when for example global motion compensation is
performed using equations (12), (13), (14), the number
of shift bits changes according to the image size
and can no longer be fixed to an integral multiple of 8.
This can be treated as follows. For example, consider
the situation where an α+hz+hw bit right shift is
required to calculate uI(x+w, y+w) and vI(x+w, y+w) as
described above, where α can have a value in the range
7-11. If hz=5, hw=1 when α is less than 10, and hz=4,
hw=1 when α=11, the number of shift bits can be
arranged to be always 16 or less. As stated
hereinbefore, when the number of shift bits is less
than 16, it can be simulated to be 16 by first
multiplying the value to be shifted by a constant.
Hence, when the image size changes, the number of shift
bits can be controlled to a convenient number by
varying other parameters (e.g. quantization step size
of motion vectors) accordingly. Care must be taken
however not to make the quantization step size of the
motion vectors so large that it causes an appreciable
degradation of the decoded image.
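
As a small worked illustration of this adjustment, the constant needed to simulate a 16 bit shift can be computed as below. The function name is hypothetical, and the parameter values in the usage note are the ones quoted in the text.

```python
def premultiplier_for_16_bit_shift(alpha: int, hz: int, hw: int) -> int:
    """Constant by which the value to be shifted is multiplied so that
    an (alpha + hz + hw) bit right shift can be done as a 16 bit shift."""
    shift_bits = alpha + hz + hw
    if shift_bits > 16:
        raise ValueError("number of shift bits exceeds 16")
    return 1 << (16 - shift_bits)
```

With hz=5, hw=1, the values α=7 and α=9 give 13 and 15 shift bits and therefore constants of 8 and 2; with α=11, hz=4, hw=1 the number of shift bits is exactly 16 and the constant is 1.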

When the algorithm shown in this specification
is applied to ordinary global motion compensation, the
motion vector of a representative point is first found
to a precision of 1/k pixels using the motion vectors
of the corners of the image, which have a precision of
1/n pixels. Next, the motion vector of a provisional
representative point is found to a precision of 1/z
pixels using the motion vector of the representative
point, and the motion vector of a pixel is found to a
precision of 1/m pixels using the motion vector of this
provisional representative point. When the motion
vectors of the corners of the image are transmitted as
a motion parameter, it is desirable to make k as large
a value as possible in order to closely approximate
bilinear interpolation/extrapolation by this parameter.
However, the horizontal and vertical components of the
motion vector of the representative point include an
error having an absolute value equal to or less than
1/(2k) due to the effect of quantization. From the
viewpoint of making the approximation more accurate, it
is preferable to increase the precision of the motion
vector of the provisional representative point also.
However, since the motion vector of the provisional
representative point is found using the motion vector
of the representative point, there is no advantage to
be gained in calculating it with an accuracy greater
than that of the motion vector of the representative
point. Therefore, it is preferable that z≤k in order
to suppress the number of bits required for the
computation, and that m≤z for the same reason.

The above discussion has considered global
motion compensation using bilinear
interpolation/extrapolation, but the number of shift
bits can be suppressed by introducing the same
processing in the case of linear
interpolation/extrapolation. For example, assume that
the horizontal and vertical components of the motion
vectors of representative points at (i, j), (i+p, j),
and (i, j+q) (where i, j, p, q are integers) multiplied
by k, are (u0, v0), (u1, v1), and (u2, v2) (where u0,
v0, u1, v1, u2, v2 are integers). The horizontal and
vertical components of the motion vector of a pixel
(x+w, y+w) multiplied by m, i.e. (u(x+w, y+w), v(x+w,
y+w)), can then be expressed as follows (where x, y,
u(x+w, y+w), v(x+w, y+w) are integers and the
definition of w is the same as above).

$$u(x+w, y+w) = (((u_1 - u_0)(w_d x + w_n - w_d i)\,q + (u_2 - u_0)(w_d y + w_n - w_d j)\,p + u_0 w_d p q)\,m) \;//\; (w_d p q k)$$
$$v(x+w, y+w) = (((v_1 - v_0)(w_d x + w_n - w_d i)\,q + (v_2 - v_0)(w_d y + w_n - w_d j)\,p + v_0 w_d p q)\,m) \;//\; (w_d p q k)$$
(16)



In this case also, p, q, k, m, wd are
respectively the α, β, hk, hm and hw powers of 2 (where
α, β, hk, hm and hw are non-negative integers), and if
α>β, this equation may be rewritten as:

$$u(x+w, y+w) = (((u_1 - u_0)(w_d x + w_n - w_d i) + (u_2 - u_0)(w_d y + w_n - w_d j)\,2^{\alpha-\beta} + u_0 w_d p)\,m) \;//\; (w_d p k)$$
$$v(x+w, y+w) = (((v_1 - v_0)(w_d x + w_n - w_d i) + (v_2 - v_0)(w_d y + w_n - w_d j)\,2^{\alpha-\beta} + v_0 w_d p)\,m) \;//\; (w_d p k)$$
(17)

As in the case when bilinear
interpolation/extrapolation is used, the integer part
of the motion vector of the pixel at (x+w, y+w) can be
found by an α+hk+hw bit right shift, therefore if
α+hk+hw is arranged to be an integral multiple of 8,
processing can be simplified for the same reason as
given above. It should be noted that when α<β, the
number of shift bits is β+hk+hw.
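
Equation (16) can be sketched as follows for one vector component. The function and argument names are hypothetical, w is taken as wn/wd with wd = 2**hw, and the division "//" is implemented here as rounding to the nearest integer by adding half the divisor before the right shift, which is an assumption about the rounding rule defined elsewhere in this specification.

```python
def linear_motion_component(u0: int, u1: int, u2: int,
                            x: int, y: int, i: int, j: int,
                            alpha: int, beta: int,
                            hk: int, hm: int, hw: int, wn: int) -> int:
    """m times one motion vector component at (x+w, y+w), from k times
    the components u0, u1, u2 at (i, j), (i+p, j), (i, j+q)."""
    p, q, m, wd = 1 << alpha, 1 << beta, 1 << hm, 1 << hw
    dx = wd * x + wn - wd * i       # wd times the horizontal offset
    dy = wd * y + wn - wd * j       # wd times the vertical offset
    numerator = ((u1 - u0) * dx * q + (u2 - u0) * dy * p
                 + u0 * wd * p * q) * m
    shift = alpha + beta + hk + hw  # log2 of the divisor wd*p*q*k
    half = (1 << shift) >> 1        # assumed rounding offset
    return (numerator + half) >> shift
```

Because the divisor is a power of 2, the division reduces to a single right shift, which is the property the choice of p, q, k and wd is intended to secure.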

The construction of the image encoding device 
and decoding device for performing image encoding and 
decoding according to this invention which uses the 
synthesis of interframe predicted images, will now be 
described. 

Fig. 6 shows the configuration of one
embodiment of an image encoding device according to
this invention.
The construction shown in the diagram is
essentially the same as that of the prior art encoding
device except for the motion compensation processor
616.

A subtractor 602 calculates the difference
between an input frame 601 (the original image of the
current frame which is to be encoded) and an output
image 613 (interframe predicted image) of an
interframe/intraframe coding changeover switch 619, and
outputs a differential image 603. This differential
image is converted to DCT coefficients by a DCT
converter 604 and then quantized by a quantizer 605 so
as to give quantized DCT coefficients 606. These
quantized DCT coefficients are output as transmission
information to a transmission path, and are also used to
synthesize an interframe predicted image in the encoder.

The procedure for synthesizing the interframe
predicted image will now be described.

The quantized DCT coefficients 606 pass through
an inverse quantizer 608 and inverse DCT converter 609
so as to give a decoded differential image 610 (the same
image as the differential image reproduced on the
receiving side). The output image 613 of the
interframe/intraframe coding changeover switch 619
(described later) is added to this in an adder 611, and
a decoded image 612 of the current frame (the same image
as the decoded image of the current frame reproduced on
the receiving side) is thus obtained. This image is
first stored in a frame memory 614, and is delayed by
the time for one frame. At this time, therefore, the
frame memory 614 outputs a decoded image 615 of the
immediately preceding frame. The decoded image of the
immediately preceding frame and the input image 601 of
the current frame are input to a motion compensation
processor 616, and the motion compensation processor
616 synthesizes the aforesaid interframe predicted
image. The configuration of this processor will be
described later.

A predicted image 617 is input to the
interframe/intraframe coding changeover switch 619
together with a "0" signal 618. This switch changes
over between interframe coding and intraframe coding by
selecting either of these inputs.

When the predicted image 617 is selected (Fig.
6 shows this case), interframe coding is performed.
On the other hand, when the "0" signal is selected, the
input image is DCT encoded as it is and output to the
transmission path, so intraframe coding is performed.

To obtain a correctly decoded image on the
receiving side, it is necessary to know whether
interframe coding or intraframe coding was performed on
the transmitting side. For this purpose, an
identifying flag 621 is output to the transmission path.
Finally, an H.261 encoded bit stream 623 is obtained by
multiplexing the quantized DCT coefficients, motion
vectors and identifying flag information in a
multiplexing unit 622.

Fig. 7 shows a typical construction of a
decoder 700 for receiving the encoded bit stream output
by the encoder of Fig. 6.

A bit stream 717 which is received is split
into quantized DCT coefficients 701, motion vectors 702
and an intraframe/interframe identifying flag 703 by a
demultiplexer 716.

The quantized DCT coefficients 701 pass through
an inverse quantizer 704 and inverse DCT converter 705
so as to give a differential image 706. This
differential image is added to an output image 715 of
an interframe/intraframe coding changeover switch 714
in an adder 707, and the result is then output as a
decoded image 708. The interframe/intraframe coding
changeover switch changes over the output depending on
the interframe/intraframe coding identifying flag 703.
The predicted image 712 used when interframe coding is
performed is synthesized in a predicted image
synthesizer 711. Here, positions are shifted according
to the received motion vectors 702 relative to the
decoded image 710 of the immediately preceding frame
stored in a frame memory 709. In the case of
intraframe coding, the interframe/intraframe coding
changeover switch merely outputs a "0" signal 713.
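
The relationship between the encoder of Fig. 6 and the decoder of Fig. 7 can be illustrated by the following minimal sketch. It is a simplification under stated assumptions: the DCT and inverse DCT are omitted, motion compensation is omitted (the predicted image is passed in directly), images are flat lists of integers, and all function names are hypothetical. Its purpose is only to show that the image reconstructed inside the encoder is the same as the image obtained by the decoder.

```python
def quantize(values, step):
    """Uniform quantization with rounding (stands in for quantizer 605)."""
    return [(2 * v + step) // (2 * step) for v in values]


def dequantize(levels, step):
    """Inverse quantization (stands in for inverse quantizers 608, 704)."""
    return [level * step for level in levels]


def select_reference(prediction, length, intra):
    """Changeover switch: the "0" signal for intraframe coding, the
    predicted image for interframe coding."""
    return [0] * length if intra else prediction


def encode_frame(frame, prediction, step, intra):
    """Return the quantized levels and the locally decoded image."""
    reference = select_reference(prediction, len(frame), intra)
    differential = [f - r for f, r in zip(frame, reference)]  # subtractor
    levels = quantize(differential, step)                      # DCT omitted
    decoded = [r + d for r, d in zip(reference, dequantize(levels, step))]
    return levels, decoded


def decode_frame(levels, prediction, step, intra):
    """Return the decoded image from the received quantized levels."""
    reference = select_reference(prediction, len(levels), intra)
    return [r + d for r, d in zip(reference, dequantize(levels, step))]
```

Because the encoder forms its prediction from the locally decoded image rather than from the original image, quantization errors do not accumulate differently on the two sides.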

Fig. 8 shows a typical construction of the
motion compensation processor 616 of the image encoding
device which uses a global motion compensation scheme
based on linear interpolation/extrapolation for
transmitting motion vectors of representative points.
Numbers which are the same as those of Fig. 6 denote
the same components. Motion estimation relating to
global motion compensation is performed between the
decoded image 615 of the immediately preceding frame
and the original image 601 of the current frame by the
global motion estimating unit 802, and global motion
compensation parameters (for example, values of the
aforesaid ua, va, ub, vb, uc, vc, ud, vd) are estimated.
Information 803 about these values is transmitted as
part of motion information 620. A global motion
compensation predicted image 804 is synthesized by a
global motion compensation predicted image synthesizer
808 using equation (3), and is supplied to a block
matching unit 805. Here, motion compensation (motion
estimation and predicted image synthesis) by block
matching is performed between the global motion
compensation predicted image and the original image of
the current frame, and block motion vector information
806 and a final predicted image 617 are thereby obtained.
This motion vector information is multiplexed with
motion parameter information in a multiplexer 807, and
output as the motion information 620.

Fig. 10 shows a typical construction of the
predicted image synthesizer 711 of Fig. 7. Numbers
which are the same as those of other diagrams denote
the same components. A global motion compensation
predicted image 804 is synthesized in the global motion
compensation predicted image synthesizer 808 using the
global motion compensation parameters 803 extracted
from the motion information 702 in a splitting unit
1002, relative to the decoded image 710 of the
immediately preceding frame. The image 804 is supplied
to a block matching predicted image synthesizer 1001,
and the final predicted image 712 is synthesized using
the block matching motion vector information 806
extracted from the motion information 702.

Fig. 9 shows another typical construction of
the motion compensation processor 616. Numbers which
are the same as those of Fig. 6 denote the same
components. In this example, global motion
compensation or block matching is applied to each block.
Motion compensation is performed between the decoded
image 615 of the immediately preceding frame and the
original image 601 of the current frame, respectively
by global motion compensation in a global motion
estimating unit 902 and global motion compensation
predicted image synthesizer 911, and by block matching
in a block matching unit 905. A selection switch 908
selects the most suitable scheme for every block
between a predicted image 903 due to global motion
compensation and a predicted image 906 due to block
matching. Global motion compensation parameters 904,
motion vectors 907 for each block and selection
information 909 relating to global motion
compensation/block matching are multiplexed by a
multiplexer 910, and the result is output as the motion
information 620.

Fig. 11 shows a typical construction of a
predicted image synthesizer 1103 of a decoder which
decodes the bit stream generated by an image encoding
device using a motion compensation processor 901.
Numbers which are the same as those of other diagrams
denote the same components. The global motion
compensation predicted image 903 is synthesized in the
global motion compensation predicted image synthesizer
911 using global motion compensation parameters 904
extracted from the motion information 702 in the
splitting unit 1002, relative to the decoded image 710
of the immediately preceding frame. The block matching
predicted image 906 is synthesized in the block
matching predicted image synthesizer 1101 using block
matching motion vector information 907 extracted from
the motion information 702 relative to the decoded
image 710 of the immediately preceding frame. A
selection switch 1104 selects either of these schemes
for each block, i.e., the predicted image 903 due to
global motion compensation or the predicted image 906
due to block matching, based on the selection
information 909 extracted from the motion information
702. After this selection process is applied to each
block, the final predicted image 712 is synthesized.
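
The per-block selection performed by the selection switch can be sketched as follows. The function name, the representation of images as lists of rows, and the use of one Boolean per block as the selection information are assumptions made for illustration.

```python
def compose_prediction(global_pred, block_pred, use_global, block_size):
    """Build the final predicted image block by block.

    global_pred and block_pred are images of equal size given as lists
    of rows; use_global[by][bx] is True when the block in block row by
    and block column bx takes the global motion compensation
    prediction, and False when it takes the block matching prediction.
    """
    final = []
    for y, (global_row, block_row) in enumerate(zip(global_pred, block_pred)):
        row = []
        for x, (g, b) in enumerate(zip(global_row, block_row)):
            chosen = use_global[y // block_size][x // block_size]
            row.append(g if chosen else b)
        final.append(row)
    return final
```

The criterion by which the encoder decides the selection information is not modelled here; the sketch only covers how the two predicted images are combined once that information is known.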

Fig. 12 shows the structural configuration of
the global motion compensation predicted image
synthesizer according to this invention. It will be
assumed that the motion vectors of the corners of the
compensation image are transmitted as global motion
parameters. The motion vectors of representative
points are calculated by equations (9), (10) in a
computing unit 1205 using information 1204 relating to
motion vectors of the corners of the image. Using
information 1206 relating to the motion vectors of
these representative points, the motion vectors of
provisional representative points are calculated for
each line using equation (11) in a computing unit 1207.
Then, by using information 1208 relating to the motion
vectors of these provisional representative points,
motion vectors for each pixel are calculated from
equation (12) in a computing unit 1209. At the same
time, using information 1210 relating to the motion
vectors of each pixel and the decoded image 1202 of the
immediately preceding frame, a global motion
compensation predicted image 1203 is synthesized and
output by a processing unit 1211.

In addition to a conventional image encoder or
image decoder using a special circuit or chip, this
invention may also be applied to a software image
encoder or software image decoder using a universal
processor.






Fig. 4 and Fig. 5 respectively show examples of
a software image encoder 400 and software image decoder
500. In the software encoder 400, an input image 401
is stored in an input frame memory 402, and a universal
processor 403 reads and encodes information from the
input frame memory 402. The program required for
driving the universal processor 403 is read from a
storage device 408 comprising a hard disk or floppy
disk, etc., and stored in a program memory 404. The
universal processor encodes the information by using a
processing memory 405. The encoded information output
by the universal processor 403 is then stored in an
output buffer 406, and output as an encoded bit stream
407.

Fig. 13 shows a flowchart of the encoding
software which runs on the software encoder shown in
Fig. 4.

First, image encoding is started in a step 1301, and 0
is input to a variable N in a step 1302. Next, in steps
1303 and 1304, if the value of N is 100, 0 is input to N.
N is a frame number counter which is incremented by 1
whenever processing of one frame is completed, and it
can take a value in the range 0-99 when encoding is
performed. When the value of N is 0, the frame being
encoded is an I frame (motion compensation is not
performed, and intraframe coding is performed for all
blocks), otherwise it is a P frame (a frame comprising
blocks where motion compensation is performed). This
means that, with N reset to 0 when it reaches 100, one I
frame is encoded each time 99 P frames have been
encoded. The optimum reset value of N varies according
to the performance of the encoder and the environment in
which the encoder is used. In this example, the value
100 was used, but the reset value is not necessarily
limited to 100. The determination and output of frame
type (I or P) are performed in a step 1305. When the
value of N is 0, 'I' is output as frame type identifying
information to the output buffer, and the frame
subsequently encoded is an I frame. Herein, the
expression "output to output buffer" means that part of
the bit stream is output from the encoder to external
devices after being stored in the output buffer (406 in
Fig. 4). When N is not 0, 'P' is output to the output
buffer as frame type identifying information, and the
frame subsequently encoded is a P frame.
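
The frame counter logic of steps 1302 to 1305 and 1314 can be sketched as follows; the generator name and the `period` parameter (100 in this example) are assumptions made for illustration.

```python
def frame_types(num_frames: int, period: int = 100):
    """Yield 'I' or 'P' for each frame, following the counter N."""
    n = 0                              # step 1302
    for _ in range(num_frames):
        if n == period:                # steps 1303, 1304
            n = 0
        yield "I" if n == 0 else "P"   # step 1305
        n += 1                         # step 1314
```

With the default period, frame 0 is an I frame, frames 1 to 99 are P frames, and frame 100 is again an I frame.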

In a step 1306, the input image is stored in a frame
memory A. The frame memory A described here denotes
part of the memory area of the software encoder (for
example, this memory area is reserved in a memory 405
of Fig. 4). In a step 1307, it is determined whether
the frame currently being encoded is an I frame. If it
is not an I frame, motion estimation/motion
compensation is performed in a step 1308.

Fig. 14 shows the detailed processing which is
performed in this step 1308. First, in a step 1401,
global motion estimation is performed between the
images stored in the frame memories A and B (the decoded
image of the immediately preceding frame is stored in
the frame memory B), and the motion vectors of the
corners of the image are output as global motion
parameters to the output buffer. In a step 1402, the
motion vectors of representative points are calculated
using the motion vectors of the corners of this image
by equations (9), (10). Next, in a step 1403, 0 is
input to a variable M. M represents the number of the
line of the image currently being processed. When M is
0, it means that the uppermost line of the image is
being processed, and when M is a value obtained by
subtracting 1 from the number of lines in the image, it
means that the lowermost line of the image is being
processed. By using the motion vectors of
representative points calculated in the step 1402, the
motion vectors of provisional representative points on
the Mth line are calculated by equation (11) in a step
1404. Then, in a step 1405, making use of the motion
vectors of these provisional representative points, the
motion vectors of all the pixels in the Mth line are
calculated by equation (12), and the Mth line of the
global motion compensation predicted image is
synthesized according to the calculated motion vectors
using the decoded image of the immediately preceding
frame which is stored in the frame memory B. In a step
1406, 1 is added to the value of M. In a step 1407, if
the value of M is equal to the number of lines in the
image, the routine proceeds to a step 1408, and if it is
not equal, the routine returns to the step 1404. When
the processing of the step 1408 starts, the predicted
image due to global motion compensation is stored in a
frame memory F. In the steps from the step 1408
onward, block matching is performed. First, in the
step 1408, motion estimation for every block is
performed between the frame memory F and the frame
memory A (input image), the motion vectors of each
block are calculated, and these motion vectors are
output to the output buffer. Next, a predicted image
is synthesized by block matching in a step 1409 using
the motion vectors and the image stored in the frame
memory F, and this is stored in a frame memory C as a
final predicted image. In a step 1410, a differential
image of the frame memories A and C is found, and this
is stored in the frame memory A.

Returning now to Fig. 13, immediately before
the process in the step 1309 is started, when the
current frame is an I frame, the input image is stored
in the frame memory A, and when the current frame is a
P frame, a differential image between the input image
and predicted image is stored in the frame memory A.
In the step 1309, DCT is applied to the image stored in
this frame memory A, and the DCT coefficients
calculated here are output to the output buffer after
being quantized. Further, in a step 1310, inverse
quantization and inverse DCT are applied to these
quantized DCT coefficients, and the image obtained as a
result is stored in the frame memory B. Next, it is
again determined whether the current frame is an I
frame, and when it is not an I frame, the images
in the frame memories B and C are added in a step 1312,
and the result is stored in the frame memory B. Here,
the encoding of one frame is finished, and the image
stored in the frame memory B immediately before the
processing of a step 1313 is performed is the
reconstructed image of the frame for which encoding has
just been completed (the same as that obtained on the
decoding side). In the step 1313, it is determined
whether the frame for which coding is complete is the
last frame, and if it is the last frame, coding is
terminated. When it is not the last frame, 1 is added
to N in a step 1314, the routine returns to the step
1303 again, and encoding of the next frame is started.
It will be understood that although the flowchart
described here relates to a method of applying block
matching to the global motion compensation predicted
image synthesized as a result of performing global
motion compensation (the method corresponding to a
device using a motion compensation processor 801 of
Fig. 8), a flowchart relating to a method of performing
global motion compensation and block matching in
parallel (the method corresponding to a device using a
motion compensation processor 901 of Fig. 9) can be
prepared by making a slight modification.



On the other hand, in the software decoder 500,
an input encoded bit stream 501 is first stored in an
input buffer 502, and read by a universal processor 503.
The universal processor 503 decodes the information
using a program memory 504 for storing a program read
from a storage device 508 comprising a hard disk or
floppy disk, etc., and a processing memory 505. The
decoded image obtained is then stored in an output
frame memory 506, and output as an output image 507.

Fig. 15 shows a flowchart of the decoding software
which runs on the software decoding device shown in Fig.
5. Processing is started in a step 1501, and in a step
1502, it is determined whether or not there is input
information. Here, if there is no input information,
decoding is terminated in a step 1503. When there is
input information, frame type information is first
input in a step 1504. The term "input" means that
information stored in an input buffer 502 is read. In
a step 1505, it is determined whether or not the read
frame type information is 'I'. When it is not 'I',
predicted image synthesis is performed in a step 1506.
The details of the processing performed in this step
1506 are shown in Fig. 16.

First, in a step 1601, the motion vectors of
the corners of the image are input. In a step 1602,
the motion vectors of representative points are
calculated by equations (9), (10) using the motion
vectors of the corners of this image. Next, in a step
1603, 0 is input to the variable M. M represents the
number of the line of the image currently being
processed. When M is zero, it means that the uppermost
line of the image is being processed, and when M is a
value obtained by subtracting 1 from the number of
lines of the image, it means that the lowermost line of
the image is being processed. Using the motion vectors
of representative points calculated in the step 1602,
the motion vectors of provisional representative points
on the Mth line are calculated by equation (11) in a
step 1604. The motion vectors for all the pixels in
the Mth line are calculated by equation (12) in a step
1605. From the calculated motion vectors, the Mth line
of the global motion compensation predicted image is
synthesized using the decoded image of the immediately
preceding frame stored in a frame memory E, and this is
stored in a frame memory G. The memory G herein means
part of the area of the memory 505 of the software
decoder.
In a step 1606, 1 is added to the value of M. If the
value of M is equal to the number of lines of the image
in a step 1607, the routine proceeds to a step 1608,
and if it is not equal, the routine returns to the step
1604. When the processing of the step 1608 is started,
the predicted image due to global motion compensation
is stored in the frame memory G. In the step 1608,
block matching is performed. Motion vector information
for each block is input, the predicted image due to
block matching is synthesized using these motion
vectors and the image stored in the frame memory G, and
this predicted image is stored in the frame memory D.

Returning to Fig. 15, quantized DCT
coefficients are input in a step 1507, and the image
obtained by applying inverse quantization and inverse
DCT to these is stored in the frame memory E. In a
step 1508, it is determined whether or not the frame
currently being decoded is an I frame. When it is not
an I frame, the images stored in the frame memories D
and E are added in a step 1509, and the resulting image
is stored in the frame memory E. The image stored in
the frame memory E immediately prior to performing the
processing of the step 1510 is the reproduced image.
In the step 1510, the image stored in this frame memory
E is output to the output frame memory 506, and output
from the decoder as an output image without
modification. When decoding of one frame is completed
in this way, processing returns again to the step 1502.

When the software image encoder and software
image decoder shown in Fig. 4 and Fig. 5 are made to
execute a program implementing the method of
synthesizing interframe predicted images described in
this specification, global motion compensation or
warping prediction can be performed with a smaller
amount of computation. Compared to the case when this
invention is not used, therefore, power consumption is
reduced, devices are less costly to manufacture, images
with more pixels can be processed in real time, and
simultaneous parallel processing can be performed
including processing other than encoding and decoding.
Moreover, by using the algorithm shown in this
specification, compressed image data which could not be
reproduced in real time due to the limited computing
ability of conventional encoders and decoders can now
be reproduced in real time.

The embodiments of this invention described 
above further comprise the following embodiments. 

10 

(1) In conventional image encoding, error coding is 
performed using discrete cosine transformation or the 
like after interframe prediction, however this 
invention may be used for image encoding or decoding 

15 when the interframe predicted image is used as the 
reconstructed image without modification. 

(2) In the above description, it was assumed that the 
shape of the image was rectangular, however the 

20 invention may be applied equally well to images having 
any arbitrary shape other than rectangular. In this 
case, the processing of the invention may first be 
applied to a rectangle enclosing an image of arbitrary 
shape, and a computation performed to calculate motion 

25 vectors only of pixels in the image of arbitrary shape. 

(3) In the above specification, a motion vector
interpolation/extrapolation algorithm was described
using two-step processing wherein the value of p or q
was a non-negative integer power of 2. However, this
two-step processing algorithm also has the effect of
reducing the numerator of a division even when p and q
are not non-negative integer powers of 2, and it is
therefore effective for preventing overflow of
registers.
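
The effect noted in (3) can be checked numerically. The following is a minimal sketch in Python, not the implementation of this specification. It assumes i = j = 0 and positive p and q, as the formulas of the claims appear to do; the function names rdiv, two_step and one_step are hypothetical; and the rounding division is assumed to round exact halves toward positive infinity, a tie rule that this passage does not fix.

```python
def rdiv(a, b):
    """Divide integer a by positive integer b, rounding to the nearest integer.

    Exact halves go toward +infinity; this tie rule is an assumption.
    """
    if b <= 0:
        raise ValueError("divisor must be positive")
    return (2 * a + b) // (2 * b)


def two_step(u, x, y, p, q, k, z, m, wn=0, wd=1):
    """Two-step interpolation of one motion vector component.

    u = (u0, u1, u2, u3) holds k times the component at the representative
    points (0, 0), (p, 0), (0, q), (p, q). Returns a pair: m times the
    component at (x+w, y+w), and the largest numerator magnitude produced.
    """
    u0, u1, u2, u3 = u
    ax = x * wd + wn  # horizontal weight of the right-hand points
    ay = y * wd + wn  # vertical weight of the lower points
    # Vertical pass: values at (0, y+w) and (p, y+w), in units of 1/z.
    num_l = ((q * wd - ay) * u0 + ay * u2) * z
    num_r = ((q * wd - ay) * u1 + ay * u3) * z
    u_l = rdiv(num_l, q * k * wd)
    u_r = rdiv(num_r, q * k * wd)
    # Horizontal pass: value at (x+w, y+w), in units of 1/m.
    num = ((p * wd - ax) * u_l + ax * u_r) * m
    peak = max(abs(num_l), abs(num_r), abs(num))
    return rdiv(num, p * z * wd), peak


def one_step(u, x, y, p, q, k, m, wn=0, wd=1):
    """Direct bilinear interpolation with a single rounding division."""
    u0, u1, u2, u3 = u
    ax = x * wd + wn
    ay = y * wd + wn
    top = (p * wd - ax) * u0 + ax * u1
    bottom = (p * wd - ax) * u2 + ax * u3
    num = ((q * wd - ay) * top + ay * bottom) * m
    return rdiv(num, p * q * k * wd * wd), abs(num)


if __name__ == "__main__":
    # p = q = 300 is deliberately not a power of 2.
    vectors = (16, 24, 8, 40)
    print(two_step(vectors, 100, 200, 300, 300, 16, 8, 4))
    print(one_step(vectors, 100, 200, 300, 300, 16, 4))
```

In this sketch the numerators of the two-step version scale with q·z in the first pass and p·m in the second, whereas the single numerator of the one-step version scales with p·q·m, so the intermediate values of the two-step version fit in narrower registers whether or not p and q are powers of 2.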

Field of the Invention 
Fig. 17 shows specific examples of an encoding/decoding
device using the predicted image synthesis
method shown in this specification.

(a) shows a case where image encoding/decoding
software installed in a personal computer 1701 is used
as the image encoding/decoding device. This software
is recorded on some type of storage medium (CD-ROM,
floppy disk, hard disk, etc.) and read by the
personal computer. Further, by connecting this
personal computer to a communication line, the device
can be used as an image communication terminal.

(b) shows a case where a coded bit stream
comprising moving image information encoded by the
method of this invention and recorded on a storage
medium 1702 is read and reconstructed by a reproducing
device 1703 comprising a device according to this
invention, and the reconstructed video signal is
displayed on a television monitor 1704. Alternatively,
the reproducing device 1703 may simply read the coded
bit stream, with the decoding device built into the
television monitor 1704.

(c) shows a case wherein the decoding device of
this invention is built into a television receiver 1705
for digital broadcasting.

(d) shows a case wherein the decoder is built
into a set-top box 1709 connected to a cable TV cable
1708 or a satellite/terrestrial wave broadcasting
antenna, and the image is reproduced on a television
monitor 1710.

Instead of the set-top box, the decoder may also be
built into the television monitor as in the case of
1704 of (b).

(e) shows a case where the encoder/decoder of
this invention is built into a digital portable
terminal 1706. The digital portable terminal may be a
transmitting/receiving terminal comprising an
encoder/decoder, a transmitting terminal only with an
encoder, or a receiving terminal only with a decoder.

(f) shows a case where the encoder is built
into a camera 1707 for photographing moving images.
Alternatively, the camera 1707 may simply acquire a
video signal, which is then input to a dedicated
encoder 1711. In any
of the devices or systems shown in the figure, the
method described in this specification permits
simplification of the device as compared with the case
when prior art technology is used.



WHAT IS CLAIMED: 

1. A method of synthesizing an interframe predicted
image comprising:

a first step for calculating the values of
motion vectors of four representative points at
coordinates (i, j), (i+p, j), (i, j+q), (i+p, j+q)
(where i, j, p, q are integers, the horizontal and
vertical components of the motion vectors of the
representative points taking the values of integral
multiples of 1/k where k is the hk power of 2, and hk
is a non-negative integer),

a second step for calculating the motion
vectors of a pixel at coordinates (x+w, y+w) by
performing bilinear interpolation/extrapolation on the
motion vectors of the four representative points of an
image where the pixel sampling interval in both the
horizontal and vertical directions is 1 and the
horizontal and vertical coordinates of the sampling
points are obtained by adding w to integers (where
w=wn/wd, wn is a non-negative integer, wd is the hw power
of 2, hw is a non-negative integer and wn<wd), wherein
the aforesaid second step comprises:

a third step for calculating the horizontal and
vertical components of motion vectors at the
coordinates (i, y+w) as numerical values which are
respectively integral multiples of 1/z (where z is the
hz power of 2, and hz is a non-negative integer) by
linear interpolation/extrapolation of the motion
vectors of the representative points at the coordinates
(i, j), (i, j+q), and for calculating the horizontal
and vertical components of the motion vectors at the
coordinates (i+p, y+w) as values which are respectively
integral multiples of 1/z (where z is the hz power of 2,
and hz is a non-negative integer) by linear
interpolation/extrapolation of the motion vectors of
the representative points at coordinates (i+p, j), (i+p,
j+q), and

a fourth step for calculating the horizontal
and vertical components of the motion vectors of the
pixel at the coordinates (x+w, y+w) as values which are
respectively integral multiples of 1/m (where m is the
hm power of 2, and hm is a non-negative integer), found
by linear interpolation/extrapolation of the aforesaid
two motion vectors at the coordinates (i, y+w), (i+p,
y+w).

2. A method of synthesizing an interframe predicted
image comprising:

a first step for calculating the values of
motion vectors of four representative points at
coordinates (i, j), (i+p, j), (i, j+q), (i+p, j+q)
(where i, j, p, q are integers, the horizontal and
vertical components of the motion vectors of the
representative points taking the values of integral
multiples of 1/k where k is the hk power of 2, and hk
is a non-negative integer),

a second step for calculating the motion
vectors of a pixel at coordinates (x+w, y+w) by
performing bilinear interpolation/extrapolation on the
motion vectors of four representative points of an
image where the pixel sampling interval in both the
horizontal and vertical directions is 1 and the
horizontal and vertical coordinates of the sampling
points are obtained by adding w to integers (where
w=wn/wd, wn is a non-negative integer, wd is the hw power
of 2, hw is a non-negative integer and wn<wd), wherein
the aforesaid second step comprises:

a third step for calculating the horizontal and
vertical components of motion vectors at the
coordinates (x+w, j) as numerical values which are
respectively integral multiples of 1/z (where z is the
hz power of 2, and hz is a non-negative integer) by
linear interpolation/extrapolation of the motion
vectors of the representative points at the coordinates
(i, j), (i+p, j), and for calculating the horizontal
and vertical components of the motion vectors at the
coordinates (x+w, j+q) as values which are respectively
integral multiples of 1/z (where z is the hz power of 2,
and hz is a non-negative integer) by linear
interpolation/extrapolation of the motion vectors of
the representative points at coordinates (i, j+q), (i+p,
j+q), and

a fourth step for calculating the horizontal
and vertical components of the motion vectors of the
pixel at the coordinates (x+w, y+w) as values which are
respectively integral multiples of 1/m (where m is the
hm power of 2, and hm is a non-negative integer), found
by linear interpolation/extrapolation of the aforesaid
two motion vectors at the coordinates (x+w, j), (x+w,
j+q).

3. A method of synthesizing an interframe prediction
image as defined in Claim 1, wherein, when the motion
vectors of a pixel at the coordinates (x+w, y+w) are
found using (u0, v0), (u1, v1), (u2, v2), (u3, v3),
which are the horizontal and vertical components of the
motion vectors of the representative points at the
coordinates (i, j), (i+p, j), (i, j+q), (i+p, j+q)
multiplied by k,

(uL(y+w), vL(y+w)), which are the
horizontal and vertical components of the motion
vectors at a point having the coordinates (i, y+w)
multiplied by z, are found by calculating:

uL(y+w) = (((q·wd - y·wd - wn) u0 + (y·wd + wn) u2) z) //// (q·k·wd),
vL(y+w) = (((q·wd - y·wd - wn) v0 + (y·wd + wn) v2) z) //// (q·k·wd)

(where [////] is a division wherein the computation
result is rounded to the nearest integer when the
result of an ordinary division is not an integer, and
the order of computational priority is equivalent to
multiplication and division),

(uR(y+w), vR(y+w)), which are the horizontal and
vertical components of the motion vector at a point
having the coordinates (i+p, y+w) multiplied by z, are
found by calculating:

uR(y+w) = (((q·wd - y·wd - wn) u1 + (y·wd + wn) u3) z) //// (q·k·wd),
vR(y+w) = (((q·wd - y·wd - wn) v1 + (y·wd + wn) v3) z) //// (q·k·wd), and

(u(x+w, y+w), v(x+w, y+w)), which are the
horizontal and vertical components of the motion vector
of a pixel at the coordinates (x+w, y+w) multiplied by
m, are found by calculating:

u(x+w, y+w) = (((p·wd - x·wd - wn) uL(y+w) + (x·wd + wn) uR(y+w)) m) // (p·z·wd),
v(x+w, y+w) = (((p·wd - x·wd - wn) vL(y+w) + (x·wd + wn) vR(y+w)) m) // (p·z·wd)

(where [//] is a division wherein the computation
result is rounded to the nearest integer when the
result of an ordinary division is not an integer, and
the order of priority is equivalent to multiplication
and division).

4. A method of synthesizing an interframe prediction
image as defined in Claim 2, wherein, when the motion
vectors of a pixel at the coordinates (x+w, y+w) are
found using (u0, v0), (u1, v1), (u2, v2), (u3, v3),
which are the horizontal and vertical components of the
motion vectors of the representative points at the
coordinates (i, j), (i+p, j), (i, j+q), (i+p, j+q)
multiplied by k,

(uT(x+w), vT(x+w)), which are the horizontal and
vertical components of the motion vectors at a point
having the coordinates (x+w, j) multiplied by z, are
found by calculating:

uT(x+w) = (((p·wd - x·wd - wn) u0 + (x·wd + wn) u1) z) //// (p·k·wd),
vT(x+w) = (((p·wd - x·wd - wn) v0 + (x·wd + wn) v1) z) //// (p·k·wd)

(where [////] is a division wherein the computation
result is rounded to the nearest integer when the
result of an ordinary division is not an integer, and
the order of computational priority is equivalent to
multiplication and division),

(uB(x+w), vB(x+w)), which are the horizontal and
vertical components of the motion vectors at a point
having the coordinates (x+w, j+q) multiplied by z, are
found by calculating:

uB(x+w) = (((p·wd - x·wd - wn) u2 + (x·wd + wn) u3) z) //// (p·k·wd),
vB(x+w) = (((p·wd - x·wd - wn) v2 + (x·wd + wn) v3) z) //// (p·k·wd), and

(u(x+w, y+w), v(x+w, y+w)), which are the
horizontal and vertical components of the motion
vectors of a pixel at the coordinates (x+w, y+w)
multiplied by m, are found by calculating:

u(x+w, y+w) = (((q·wd - y·wd - wn) uT(x+w) + (y·wd + wn) uB(x+w)) m) // (q·z·wd),
v(x+w, y+w) = (((q·wd - y·wd - wn) vT(x+w) + (y·wd + wn) vB(x+w)) m) // (q·z·wd)

(where [//] is a division wherein the computation
result is rounded to the nearest integer when the
result of an ordinary division is not an integer, and
the order of priority is equivalent to multiplication
and division).

5. A method of synthesizing an interframe predicted
image as defined in Claim 1, wherein the absolute value
of p is the α power of 2 (where α is a non-negative
integer).

6. A method of synthesizing an interframe predicted
image as defined in Claim 2 or 4, wherein the absolute
value of q is the β power of 2 (where β is a
non-negative integer).

7. A method of synthesizing an interframe predicted
image as defined in Claim 1, wherein the absolute
values of p and q are respectively the α power of 2
and β power of 2 (where α, β are non-negative
integers).

8. A method of synthesizing an interframe predicted
image as defined in Claim 2, wherein the absolute
values of p and q are respectively the α power of 2
and β power of 2 (where α, β are non-negative
integers).

9. A method of synthesizing an interframe predicted
image as defined in Claim 5, wherein α+hz is a
positive integral multiple of 8, and w is 0.

10. A method of synthesizing an interframe predicted
image as defined in Claim 6, wherein β+hz is a
positive integral multiple of 8, and w is 0.

11. A method of synthesizing an interframe predicted
image as defined in Claim 5, wherein α+hz+hw is a
positive integral multiple of 8, and w>0.

12. A method of synthesizing an interframe predicted
image as defined in Claim 6, wherein β+hz+hw is a
positive integral multiple of 8, and w>0.

13. A method of synthesizing an interframe predicted
image as defined in Claim 9, wherein the value of hz is
varied according to the value of α so that α+hz is 16
or less for plural different values of α.

14. A method of synthesizing an interframe predicted
image as defined in Claim 10, wherein the value of hz
is varied according to the value of β so that β+hz is
16 or less for plural different values of β.

15. A method of synthesizing an interframe predicted
image as defined in Claim 11, wherein the value of hz
is varied according to the value of α so that α+hz+hw
is 16 or less for plural different values of α.

16. A method of synthesizing an interframe predicted
image as defined in Claim 12, wherein the value of hz
is varied according to the value of β so that β+hz+hw
is 16 or less for plural different values of β.

17. A method of synthesizing an interframe predicted
image as defined in any of Claims 1 to 16, wherein z>m.

18. A method of synthesizing an interframe predicted
image as defined in any of Claims 1 to 17, wherein k>z.

19. A method of synthesizing an interframe predicted
image as defined in any of Claims 1 to 18, wherein the
absolute values of p and q are respectively different
from the number of horizontal and vertical pixels in
the image.

20. A method of synthesizing an interframe predicted
image as defined in any of Claims 1 to 19, wherein,
when r is the number of pixels in the horizontal
direction and s is the number of pixels in the vertical
direction of the image (where r, s are positive
integers), 1/2 of the absolute value of p is less than
r, the absolute value of p is equal to or greater than
r, 1/2 of the absolute value of q is less than s, and
the absolute value of q is equal to or greater than s.

21. A method of synthesizing an interframe predicted
image as defined in any of Claims 1 to 19, wherein,
when r is the number of pixels in the horizontal
direction and s is the number of pixels in the vertical
direction of the image (where r, s are positive
integers), the absolute value of p is equal to or less
than r, twice the absolute value of p is larger than r,
the absolute value of q is equal to or less than s, and
twice the absolute value of q is larger than s.

22. A method of synthesizing an interframe predicted
image as defined in any of Claims 1 to 21, wherein,
when the number of pixels in the horizontal and
vertical directions of the image is respectively r and
s (where r and s are positive integers), and the pixels
of the image lie in a range wherein the horizontal
coordinate is from 0 to less than r and the vertical
coordinate is from 0 to less than s,

(u0, v0), (u1, v1), (u2, v2), (u3, v3)
which are expressed by

u'(x, y) = (((s·cd - cn - y·cd) ((r·cd - cn - x·cd) u00 + (x·cd + cn) u01)
    + (y·cd + cn) ((r·cd - cn - x·cd) u02 + (x·cd + cn) u03)) k) /// (r·s·n·cd),
v'(x, y) = (((s·cd - cn - y·cd) ((r·cd - cn - x·cd) v00 + (x·cd + cn) v01)
    + (y·cd + cn) ((r·cd - cn - x·cd) v02 + (x·cd + cn) v03)) k) /// (r·s·n·cd),
u0 = u'(i, j)
v0 = v'(i, j)
u1 = u'(i+p, j)
v1 = v'(i+p, j)
u2 = u'(i, j+q)
v2 = v'(i, j+q)
u3 = u'(i+p, j+q)
v3 = v'(i+p, j+q)

(where [///] is a division wherein the computation
result is rounded to the nearest integer when the
result of an ordinary division is not an integer, and
the order of priority is equivalent to multiplication
and division), are used as the k times horizontal and
vertical components of motion vectors of representative
points (i, j), (i+p, j), (i, j+q), (i+p, j+q), by using
(u00, v00), (u01, v01), (u02, v02), (u03, v03) (where
u00, v00, u01, v01, u02, v02, u03, v03 are integers),
which are n times (where n is a positive integer)
motion vectors at the corners of an image situated at
the coordinates (-c, -c), (r-c, -c), (-c, s-c), (r-c,
s-c) (where c=cn/cd, cn is a non-negative integer, cd
is a positive integer and cn<cd), whereof the
horizontal and vertical components take the values of
integral multiples of 1/n.

23. An image encoding method using a method of
synthesizing an interframe predicted image comprising:

a first step for outputting a difference
between an image signal of a current frame which is to
be encoded and an interframe predicted image as a
differential image,

a second step for transforming the signal of
said differential image to obtain a transformed signal
which is then encoded,

a fourth step for applying an inverse
transformation to said transformed signal to produce a
decoded differential image of said differential image,
and

a fifth step for producing an interframe
predicted image signal for the frame immediately
following said current frame image signal using said
decoded differential image and said interframe
predicted image, wherein

said fifth step is performed by an interframe
predicted image synthesis method as defined in any of
Claims 1 to 16.

24. An image coding method as defined in Claim 23,
wherein said fifth step comprises a step for detecting
and encoding information relating to motion vectors of
the representative points.

25. An image coding method as defined in Claim 23,
wherein the representative points in said fifth step
are the corners of the image.

26. An image decoding method comprising:

a first step for inputting an interframe coding
signal of an image frame which is to be decoded and
motion vector information concerning said image frame,

a second step for transforming said interframe
coding signal into a decoded differential signal,

a third step for producing an interframe
predicted image from a decoded image signal of another
image frame different in time from said image to be
decoded and said motion vector information, and

a fourth step for adding the decoded
differential signal and said interframe predicted image
signal to obtain a decoded image signal of said image
frame which is to be decoded, wherein

said third step is performed by an interframe
predicted image synthesis method as defined in any of
Claims 1 to 16.

27. An image decoding method as defined in Claim 26,
wherein said plural representative points are the
corner points of said image used by reproducing
information relating to the motion vectors of the
representative points directly encoded as encoded data.

28. An image decoding method as defined in Claim 26,
wherein said plural representative points are the
corner points of said image.

29. An encoding device comprising an encoder which
encodes an image signal of a current frame to be
encoded, an interframe predicted image and part of the
output of a first transforming unit which transforms
the image signal, a second transforming unit which
applies an inverse transformation to part of the output
of the first transforming unit to obtain a decoded
differential image of said differential image, decoding
means which obtains a decoded image of the signal of
the current frame from said decoded differential image
and said interframe predicted image, and a motion
compensating unit which adds the decoded image of said
immediately preceding frame and the input image of said
current frame to synthesize said interframe predicted
image, wherein:

said motion compensating unit comprises a
global motion vector estimating unit for calculating
the values of motion vectors of four representative
points at coordinates (i, j), (i+p, j), (i, j+q), (i+p,
j+q) (where i, j, p, q are integers, the horizontal and
vertical components of the motion vectors of the
representative points taking the values of integral
multiples of 1/k where k is the hk power of 2, and hk
is a non-negative integer) of said decoded image of the
immediately preceding frame from the decoded image of
said immediately preceding frame and the input image of
the current frame, and a predicted image synthesizing
unit for producing an interframe predicted image which
predicts the signal of the current frame to be encoded
from said motion vectors and the decoded image of said
immediately preceding frame, and

said predicted image synthesizing unit
comprises a computing unit for calculating the motion
vectors of a point at coordinates (i, y+w) by
performing bilinear interpolation/extrapolation on the
motion vectors of representative points situated at
coordinates (i, j), (i, j+q) of an image lying on a
number of sampling points obtained by adding w to
integer horizontal and vertical coordinates when a pixel
sampling interval in both the horizontal and vertical
directions is 1 (where w=wn/wd, wn is a non-negative
integer, wd is the hw power of 2, hw is a non-negative
integer and wn<wd), calculating the horizontal and
vertical components of motion vectors at the
coordinates (i, y+w) as numerical values which are
respectively integral multiples of 1/z (where z is the
hz power of 2, and hz is a non-negative integer) by
linear interpolation/extrapolation of the motion
vectors of representative points at the coordinates (i,
j), (i, j+q), calculating the horizontal and vertical
components of the motion vectors at the coordinates
(i+p, y+w) as values which are respectively integral
multiples of 1/z (where z is the hz power of 2, and hz
is a non-negative integer) by linear
interpolation/extrapolation of the motion vectors of
the representative points at coordinates (i+p, j), (i+p,
j+q), and calculating the horizontal and vertical
components of the motion vectors of the pixel at the
coordinates (x+w, y+w) as values which are respectively
integral multiples of 1/m (where m is the hm power of 2,
and hm is a non-negative integer) by linear
interpolation/extrapolation of the aforesaid two motion
vectors at the coordinates (i, y+w), (i+p, y+w), and

a synthesizing unit for synthesizing a
predicted image from the motion vectors of a pixel
situated at the aforesaid coordinates (x+w, y+w) and
the decoded image of said immediately preceding frame.

30. An encoding device comprising a subtractor which
outputs a difference between an image signal of a
current frame to be encoded and an interframe predicted
image signal as a differential image, an encoder which
encodes part of the output of a first transforming unit
for transforming the signal of said differential image,
a second transforming unit which applies an inverse
transformation to part of the output of the first
transforming unit to obtain a decoded differential
image of said differential image, decoding means which
obtains a decoded image of the current frame from said
decoded differential image and said interframe
predicted image, and a motion compensating unit which
uses the decoded image of said immediately preceding
frame and an input image 601 of said current frame to
synthesize said interframe predicted image, wherein

said motion compensating unit comprises a
global motion vector estimating unit for calculating
the values of motion vectors of four representative
points at coordinates (i, j), (i+p, j), (i, j+q), (i+p,
j+q) (where i, j, p, q are integers, the horizontal and
vertical components of the motion vectors of the
representative points taking the values of integral
multiples of 1/k where k is the hk power of 2, and hk
is a non-negative integer) of said decoded image of the
immediately preceding frame from the decoded image of
said immediately preceding frame and the input image of
the current frame, and a predicted image synthesizing
unit for producing an interframe predicted image which
predicts the signal of the current frame to be encoded
from said motion vectors and the decoded image of said
immediately preceding frame, and

said predicted image synthesizing unit
comprises a computing unit for calculating the motion
vectors of a point at coordinates (x+w, j) by
performing bilinear interpolation/extrapolation on the
motion vectors of representative points situated at
coordinates (i, j), (i+p, j) of an image where the
pixel sampling interval in both the horizontal and
vertical directions is 1 and the horizontal and
vertical coordinates of the sampling points are
obtained by adding w to integers (where w=wn/wd, wn is
a non-negative integer, wd is the hw power of 2, hw is a
non-negative integer and wn<wd), calculating the
horizontal and vertical components of motion vectors at
the coordinates (x+w, j+q) as numerical values which
are respectively integral multiples of 1/z (where z is
the hz power of 2, and hz is a non-negative integer) by
linear interpolation/extrapolation of the motion
vectors of the representative points at the coordinates
(i, j+q), (i+p, j+q), and calculating the horizontal
and vertical components of the motion vectors of the
pixel at the coordinates (x+w, y+w) as values which are
respectively integral multiples of 1/m (where m is the
hm power of 2, and hm is a non-negative integer) by
linear interpolation/extrapolation of the aforesaid two
motion vectors at the coordinates (x+w, j), (x+w, j+q),
and

a synthesizing unit for synthesizing a
predicted image from the motion vectors of a pixel
situated at the aforesaid coordinates (x+w, y+w) and
the decoded image of said immediately preceding frame.



31. An encoding device as defined in Claim 29, wherein,
when the motion vectors of a pixel at the coordinates
(x+w, y+w) are found using (u0, v0), (u1, v1), (u2, v2),
(u3, v3), which are the horizontal and vertical
components of the motion vectors of the representative
points at the coordinates (i, j), (i+p, j), (i, j+q),
(i+p, j+q) multiplied by k,

(uL(y+w), vL(y+w)), which are the horizontal and
vertical components of the motion vectors at a point
having the coordinates (i, y+w) multiplied by z, are
found by calculating:

uL(y+w) = (((q·wd - y·wd - wn) u0 + (y·wd + wn) u2) z) //// (q·k·wd),
vL(y+w) = (((q·wd - y·wd - wn) v0 + (y·wd + wn) v2) z) //// (q·k·wd)

(where [////] is a division wherein the computation
result is rounded to the nearest integer when the
result of an ordinary division is not an integer, and
the order of computational priority is equivalent to
multiplication and division),

(uR(y+w), vR(y+w)), which are the horizontal and
vertical components of the motion vector at a point
having the coordinates (i+p, y+w) multiplied by z, are
found by calculating:

uR(y+w) = (((q·wd - y·wd - wn) u1 + (y·wd + wn) u3) z) //// (q·k·wd),
vR(y+w) = (((q·wd - y·wd - wn) v1 + (y·wd + wn) v3) z) //// (q·k·wd), and

(u(x+w, y+w), v(x+w, y+w)), which are the
horizontal and vertical components of the motion vector
of a pixel at the coordinates (x+w, y+w) multiplied by
m, are found by calculating:

u(x+w, y+w) = (((p·wd - x·wd - wn) uL(y+w) + (x·wd + wn) uR(y+w)) m) // (p·z·wd),
v(x+w, y+w) = (((p·wd - x·wd - wn) vL(y+w) + (x·wd + wn) vR(y+w)) m) // (p·z·wd)

(where [//] is a division wherein the computation
result is rounded to the nearest integer when the
result of an ordinary division is not an integer, and
the order of priority is equivalent to multiplication
and division).

32. An interframe predicted image encoding device as
defined in Claim 30, wherein, when the motion vectors
of a pixel at the coordinates (x+w, y+w) are found
using (u0, v0), (u1, v1), (u2, v2), (u3, v3), which are
the horizontal and vertical components of the motion
vectors of the representative points at the coordinates
(i, j), (i+p, j), (i, j+q), (i+p, j+q) multiplied by k,

(uT(x+w), vT(x+w)), which are the horizontal and
vertical components of the motion vectors at a point
having the coordinates (x+w, j) multiplied by z, are
found by calculating:

uT(x+w) = (((p·wd - x·wd - wn) u0 + (x·wd + wn) u1) z) //// (p·k·wd),
vT(x+w) = (((p·wd - x·wd - wn) v0 + (x·wd + wn) v1) z) //// (p·k·wd)

(where [////] is a division wherein the computation
result is rounded to the nearest integer when the
result of an ordinary division is not an integer, and
the order of computational priority is equivalent to
multiplication and division),

(uB(x+w), vB(x+w)), which are the horizontal and
vertical components of the motion vectors at a point
having the coordinates (x+w, j+q) multiplied by z, are
found by calculating:

uB(x+w) = (((p·wd - x·wd - wn) u2 + (x·wd + wn) u3) z) //// (p·k·wd),
vB(x+w) = (((p·wd - x·wd - wn) v2 + (x·wd + wn) v3) z) //// (p·k·wd), and

(u(x+w, y+w), v(x+w, y+w)), which are the
horizontal and vertical components of the motion
vectors of a pixel at the coordinates (x+w, y+w)
multiplied by m, are found by calculating:

u(x+w, y+w) = (((q·wd - y·wd - wn) uT(x+w) + (y·wd + wn) uB(x+w)) m) // (q·z·wd),
v(x+w, y+w) = (((q·wd - y·wd - wn) vT(x+w) + (y·wd + wn) vB(x+w)) m) // (q·z·wd)

(where [//] is a division wherein the computation
result is rounded to the nearest integer when the
result of an ordinary division is not an integer, and
the order of priority is equivalent to multiplication
and division).



33. An interframe predicted image encoding device as
defined in Claim 29, wherein the absolute value of p is
the α power of 2 (where α is a non-negative integer).

34. An interframe predicted image encoding device as
defined in Claim 30, wherein the absolute value of q is
the β power of 2 (where β is a non-negative integer).

35. An encoding device as defined in Claim 29, wherein
the absolute values of p and q are respectively the α
power of 2 and β power of 2 (where α, β are
non-negative integers).

36. An encoding device as defined in Claim 30, wherein
the absolute values of p and q are respectively the α
power of 2 and β power of 2 (where α, β are
non-negative integers).

37. An encoding device as defined in Claim 33, wherein 
20 a+hz is a positive integral multiple of 8, and w is 0. 

38. An encoding device as defined in Claim 34, wherein 
j3+hz is a positive integral multiple of 8, and w is 0. 

25 39. An encoding device as defined in Claim 33, wherein 
a+hz+hw is a positive integral multiple of 8, and w>0. 
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40. An encoding device as defined in Claim 34, wherein 
j6 +hz+hw is a positive integral multiple of 8, and w>0. 

41. An encoding device as defined in Claim 37, wherein 
5 the value of hz is varied according to the value of a 

so that a+ hz is 16 or less for plural different 
values of a . 

42. An encoding device as defined in Claim 38, wherein 
10 the value of hz is varied according to the value of B 

so that /3+hz is 16 or less for plural different values 
of j3 . 

43. An encoding device as defined in Claim 39, wherein
the value of hz is varied according to the value of α
so that α+hz+hw is 16 or less for plural different
values of α.

44. An encoding device as defined in Claim 40, wherein
the value of hz is varied according to the value of β
so that β+hz+hw is 16 or less for plural different
values of β.

45. An encoding device as defined in any of Claims 29
to 40, wherein said motion compensating unit further
comprises means for encoding information relating to
motion vectors of said representative points.

46. An encoding device as defined in any of Claims 29
to 40, wherein said representative points are points at
the corners of an image.



47. An encoding device as defined in any of Claims 29
to 40, wherein said first transforming unit and second
transforming unit are respectively a circuit which
applies a DCT transformation and quantizes the signal
of said differential image, and a circuit which
performs inverse quantization and an inverse DCT
transformation.

48. An image decoding device comprising a transforming
unit for transforming a signal of a differential image
of an interframe differential code of an encoded image
signal, a frame memory for storing a decoded frame
image signal, a predicted image synthesizing unit for
inputting motion vectors of said encoded image signal
and the decoded frame image signal of said frame memory,
and synthesizing a predicted image, an adding unit for
adding the output of said predicted image synthesizing
unit and the output of said transforming circuit to
produce a decoded image, and means for storing the
output of said adding unit in said frame memory,
wherein said predicted image synthesizing unit
comprises means for synthesizing an interframe
predicted image as defined in any of Claims 1 to 16.



49. A storage medium on which software has been
recorded for implementing the method of synthesizing an
interframe predicted image as defined in any of Claims
1 to 22.

50. A storage medium on which software has been 
recorded for driving an image decoding device as 
defined in Claim 48. 

51. A storage medium on which an encoded bit stream
generated by an encoding method as defined in Claim 23,
24 or 25 has been recorded.

52. A storage medium on which a compressed, encoded bit
stream which can be decoded by an image decoding method
as defined in Claim 26, 27 or 28, has been recorded.

ABSTRACT

A method is provided to simplify the
computation involved in global motion compensation and
warping prediction in coding and decoding of motion
compensated image signals. In the synthesis of a
global motion compensated predicted image 1203 which
predicts a current frame image from an immediately
preceding frame image 1202 using motion vectors 1205 of
plural representative points having a particular
spatial interval in an image frame, a first
interpolation/extrapolation 1207 is performed to
calculate motion vectors of provisional representative
points from motion vectors 1206 of representative
points, and a second interpolation/extrapolation 1209
is then performed to calculate motion vectors 1210 for
each pixel from motion vectors 1208 of the provisional
representative points.
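
The two-stage computation summarized above can be illustrated with a short sketch. It is not taken from the specification: it assumes a bilinear motion model, representative points at the image corners (0,0), (r,0), (0,s), (r,s), provisional representative points at (0,0), (p,0), (0,q), (p,q) with p and q powers of 2, vectors stored in units of 1/k pixel, and ties rounded toward positive infinity. All function and parameter names are hypothetical, and only one vector component is shown; the other would be handled the same way.

```python
from fractions import Fraction

def bilinear(x, y, w, h, c00, c10, c01, c11):
    """Exact bilinear interpolation/extrapolation from values at
    (0,0), (w,0), (0,h), (w,h)."""
    num = ((w - x) * (h - y) * c00 + x * (h - y) * c10
           + (w - x) * y * c01 + x * y * c11)
    return Fraction(num, w * h)

def round_half_up(value):
    """Round a Fraction to the nearest integer (assumed tie rule: toward +inf)."""
    return (2 * value.numerator + value.denominator) // (2 * value.denominator)

def provisional_vectors(r, s, alpha, beta, k, corners):
    """Stage 1: vectors at (0,0), (p,0), (0,q), (p,q) as integers in 1/k pixel."""
    p, q = 1 << alpha, 1 << beta
    points = [(0, 0), (p, 0), (0, q), (p, q)]
    return [round_half_up(k * bilinear(x, y, r, s, *corners)) for x, y in points]

def pixel_vector(x, y, alpha, beta, prov):
    """Stage 2: per-pixel vector in 1/k pixel; dividing by p*q is a right shift.
    Requires alpha + beta >= 1."""
    p, q = 1 << alpha, 1 << beta
    u00, u10, u01, u11 = prov
    num = ((p - x) * (q - y) * u00 + x * (q - y) * u10
           + (p - x) * y * u01 + x * y * u11)
    shift = alpha + beta
    return (num + (1 << (shift - 1))) >> shift

# Example: a 176 x 144 frame, p = q = 256, vectors kept to 1/16 pixel.
prov = provisional_vectors(176, 144, 8, 8, 16, (0, 2, -1, 3))
center = pixel_vector(88, 72, 8, 8, prov)
```

Under these assumptions the only general division occurs in stage 1, four times per frame; every per-pixel computation in stage 2 uses multiplications, additions and one shift.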

A division performed when the predicted image
is synthesized can be replaced by a shift computation
involving a smaller number of bits, so the processing
carried out by a computer or special hardware can be
simplified.
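
As a small illustration of why a power-of-2 divisor helps, the sketch below compares a rounding integer division with an equivalent add-and-shift. The function names are hypothetical, and the tie rule (halves rounded toward positive infinity) is an assumption; the text above states only that the result is rounded to the nearest integer.

```python
def round_div(a, d):
    """Integer division rounded to the nearest integer, for d > 0.
    Assumed tie rule: halves go toward +infinity."""
    return (2 * a + d) // (2 * d)

def round_div_pow2(a, n):
    """Same result for d = 2**n with n >= 1, using one addition and one
    arithmetic right shift instead of a division."""
    return (a + (1 << (n - 1))) >> n

# 7 / 4 = 1.75, so both forms give the same rounded quotient.
same = round_div(7, 4) == round_div_pow2(7, 2)
```

The shift form relies on Python's `>>` being an arithmetic (floor) shift for negative integers; in a language where right-shifting a negative value is implementation-defined, the negative case would need separate handling.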